Abstract

Stable mixtures of cooperators and defectors are often seen in nature. This fact 
is at odds with predictions based on linear public goods games under weak selection. 
That model implies fixation either of cooperators or of defectors, and the former 
scenario requires a level of group relatedness larger than the cost/benefit ratio, being 
therefore expected only if there is either kin recognition or a very low cost/benefit 
ratio, or else under stringent conditions with low gene flow. This motivated us to 
study here social evolution in a large class of group structured populations, with 
arbitrary multi-individual interactions among group members and random migration 
among groups. Under the assumption of weak selection, we analyze the equilibria 
and their stability. For some significant models of social evolution with non-linear 
fitness functions, including contingent behavior in iterated public goods games and 
threshold models, we show that three regimes occur, depending on the migration 
rate among groups. For sufficiently high migration rates, a rare cooperative allele 
A cannot invade a monomorphic population of asocial alleles N. For sufficiently low 
values of the migration rate, allele A can invade the population, when rare, and then 
fixate, eliminating N. For intermediate values of the migration rate, allele A can 
invade the population, when rare, producing a polymorphic equilibrium, in which it 
coexists with N. The equilibria and their stability do not depend on the details of the 
population structure. The levels of migration (gene flow) and group relatedness that 
allow for invasion of the cooperative allele leading to polymorphic equilibria with the 
non-cooperative allele are common in nature. 
1 Introduction



In [ ] we studied the evolution of behavior in group structured populations, in a population
structure that we called the two-level Fisher-Wright framework with selection and migration
(21FW), which incorporates intra- and intergroup competition and migration. The methods
used in that paper allow for the study of arbitrary group interactions that affect the
reproductive success of group members as linear or non-linear functions of the number of
group members exhibiting each possible behavior. Those methods allow for the analysis
under an arbitrary strength of selection, but are limited to the problem of invasion of the 
population by a rare mutant allele. Here we add the assumption that selection is weak, and 
we obtain a characterization of all the equilibria and their stabilities in the same frame- 
work, as well as in a large class of biologically relevant generalizations. This allows us to 
identify three regimes, determined by demographic parameters of the model. (1) When the 
migration rate among groups is sufficiently high, invasion by a rare mutant social allele is 
not possible. (2) When the migration rate is sufficiently low, invasion happens and leads 
to fixation. (3) For intermediate values of the migration rate, invasion happens and leads 
to a polymorphic equilibrium. For some significant models of social interaction (e.g., iterated
linear public goods games with contingent cooperation [20, 7, 44], threshold payoff
functions arising, for instance, from coordinated punishment [ , ]) all three regimes occur.

We will consider a range of population structures that extend 21FW in various natural 
ways. In 21FW groups compete directly and more efficient groups split and replace less 
efficient ones (see [ ] for a detailed introduction, or Section 2 for a brief introduction). 
In Section 2 we extend the 21FW population structure in four ways. In Gen. 1 we will
allow for a separation of the time scales in which groups compete and in which individuals
reproduce. In Gen. 2 we will assume that groups may or may not replace other groups, but
more efficient groups will become larger than the less efficient ones, and contribute more
migrants to the population, allowing cooperative alleles to spread. In Gen. 3 we will allow
for arbitrary patterns of recolonization, for occasional formation of new groups, and for
differential migration rates depending on phenotype, group size and group composition.
And in Gen. 4 we will allow for more general reproductive systems than the Fisher-Wright
one.

Importantly, our characterization of the equilibria and their stability (Sections 4 and 
5) is essentially the same for all the population structures that are considered, suggesting 
a substantial level of generality. In particular, the conditions for invasion of a rare allele, 
obtained in [44] are extended to this large class of population structures, when selection is 
weak. This is important because in [ ] these conditions were used to show that cooper- 
ative and even strongly altruistic alleles can invade a group structured population under 
realistically modest levels of group relatedness (realistically high levels of gene flow; as 
reported, e.g., in [ ] (Tables 6.4 and 6.5), [ ] (Table 4.9) and [ ]), produced simply by 
viscosity, without any kin-recognition mechanism. Particularly noteworthy are two facts. 
First, conditions for invasion do not depend on the relation between the time scales in which
individuals reproduce and groups compete (Gen. 1). Group competition can, for instance,
occur in the form of occasional direct conflict, which does not need to take place often (in
the time scale of the individuals' generations) for selection to favor cooperative alleles. This
follows from the fact that carriers of cooperative alleles only incur reproductive costs when
cooperation is needed, i.e., at times of conflict, in this example. The time scales
in which costs and benefits are relevant are therefore both the time scale of group competition,
and hence are the same. Second, even when each group creates a single group in the next
generation, so that groups do not compete directly, and with regulation of the population
occurring only at the group level (Gen. 2), the conditions for invasion are not changed, and
invasion can occur under the same realistic conditions. This is due to the elasticity in
group size, which allows groups with more cooperators to be slightly larger. Since we are
only considering weak selection, that enlargement effect will be small and hard to observe
in practice.

The tools that we provide here for computing equilibria/stability are easy to implement. 
And when groups are large they simplify into expressions that only depend on the scaled 
gene flow parameter, or alternatively on the standard group relatedness parameter. In this 
way we provide an easy way to analyze evolution of social behavior under weak selection 
in group structured populations. As we explained in [ ], one of the most commonly used 
approaches to that problem, the Taylor-Frank method [48, 46, 51, 12], assumes implicitly
that marginal fitness functions are linear functions of the number of mutants in a group. 
In other words, the reproductive success of group members has to be described in good 
approximation by a linear public goods game. The same remark applies to any methodology 
in which one expresses the fitness of a focal individual in terms of partial derivatives with 
respect to the focal individual's phenotype and the phenotype of the individuals with 
whom the focal interacts, as in [40, 27, 28]. Our approach allows instead for arbitrary 
marginal fitness functions, that typically result from complex interactions involving several 
individuals [20, 7, 25, 3, 18, 12, 49, 2, 13, 8, 45, 36, 19, 4, 5, 6]. This feature of our approach
also distinguishes it from, and makes it much more widely applicable than, approaches based on
interactions between pairs of individuals (dyadic interactions) [38, 26].

Our results help explain the common occurrence of polymorphic equilibria between co- 
operators and defectors in nature [19], [14], [39], [13]. As pointed out in [ ], when the 
reproductive success of group members is modeled as a linear public goods game and selec- 
tion is weak, invasion leads to fixation, without the possibility of polymorphic equilibria. 
That paper explored strong selection as an alternative way of obtaining polymorphic equi- 
libria. Our results show that the issue is not necessarily one of strong versus weak selection, 
but that the non-linear nature of the interaction in the groups may yield those equilibria 
even when selection is weak. 

2 Population structures 

We begin with a brief reprise of the 21FW framework (see [44] for further motivation and
details). In 21FW haploid individuals live in a large number $g$ of groups of size $n$, and
are of two genetically determined phenotypic types, A or N. Generations do not overlap,
reproduction is asexual and the type is inherited by the offspring. (Except when stated
otherwise, we suppose that mutations do not occur within the time scales being considered.)
The relative fitness ($w$) of a type A, and that of a type N, in a group that has $k$ types A, are,
respectively, $w_k^A = 1 + \delta v_k^A$ and $w_k^N = 1 + \delta v_k^N$, with the convention that $v_0^N = 0$, i.e., $w_0^N = 1$.
The quantities $v_k^A$ and $v_k^N$ represent life-cycle payoffs derived from behavior, physiology, etc.
The parameter $\delta \geq 0$ indicates the strength of selection. When $\delta > 0$ is small, selection
is weak, and we will also refer to the payoff functions $v_k^A$ and $v_k^N$ in this case as marginal
fitness functions. The creation of a new generation in the 21FW through inter- and intragroup
competition, followed by migration at rate $m$, is described as follows. Fisher-Wright
intergroup competition: Each group in the new generation independently descends from a
group in the previous generation, with probabilities proportional to group average fitness
$\bar{w}_k = (k w_k^A + (n-k) w_k^N)/n$. Fisher-Wright intragroup competition: If a group descends from a group
with $k$ types A, then it will have $i$ types A with probability $P_{k,i} = \mathrm{bin}(i \mid n, k w_k^A/(n \bar{w}_k))$, where
the binomial probability $\mathrm{bin}(i \mid n, q)$ is the probability of $i$ successes in $n$ independent trials,
each with probability $q$ of success. Migration: Once the new $g$ groups have been formed
according to the two-level competition process, a random fraction $m$ of the individuals
migrates. Migrants are randomly shuffled. Note: The assignment of relative fitness to the
groups in the fashion done above is a necessary and sufficient condition [ ] for individuals
in the parental generation to have each an expected number of offspring proportional to
their personal relative fitness.
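
The life cycle just described can be made concrete with a short numerical sketch. The Python code below is an illustration only, not the implementation used in the paper: it iterates the group-composition frequencies in the limit of infinitely many groups, and it assumes that migration acts as an independent replacement of each individual, with probability m, by a random draw from the post-selection population. The function name `next_generation`, its argument names, and any payoff values passed to it are hypothetical.

```python
from math import comb


def binom_pmf(i, n, q):
    """Probability of i successes in n independent trials with success probability q."""
    return comb(n, i) * q**i * (1.0 - q) ** (n - i)


def next_generation(f, v_A, v_N, delta, m):
    """One generation of the two-level Fisher-Wright life cycle (infinitely many groups).

    f[k]   : fraction of groups with k types A, k = 0..n
    v_A[k] : payoff of a type A in a group with k types A (v_A[0] is never used)
    v_N[k] : payoff of a type N in a group with k types A (v_N[n] is never used)
    delta  : strength of selection, m : migration rate
    """
    n = len(f) - 1
    w_A = [1.0 + delta * v for v in v_A]
    w_N = [1.0 + delta * v for v in v_N]
    # Group average fitness of a group with k types A.
    w_bar = [(k * w_A[k] + (n - k) * w_N[k]) / n for k in range(n + 1)]

    # Intergroup competition: parents are chosen proportionally to f[k] * w_bar[k].
    total = sum(f[k] * w_bar[k] for k in range(n + 1))
    parent = [f[k] * w_bar[k] / total for k in range(n + 1)]

    # Intragroup competition: each offspring of a k-group is type A with probability q[k].
    q = [k * w_A[k] / (n * w_bar[k]) for k in range(n + 1)]

    # Frequency of type A among newborns, hence among migrants (modelling assumption).
    p_migrants = sum(parent[k] * q[k] for k in range(n + 1))

    new_f = [0.0] * (n + 1)
    for k in range(n + 1):
        # A group member is a local offspring w.p. 1 - m, a migrant w.p. m.
        prob_A = (1.0 - m) * q[k] + m * p_migrants
        for i in range(n + 1):
            new_f[i] += parent[k] * binom_pmf(i, n, prob_A)
    return new_f
```

With `delta = 0` the update preserves the frequency of types A, which is a convenient sanity check before adding selection.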

We denote by $f_k(t)$, $k = 0, \ldots, n$, the fraction of groups in generation $t$ that have
exactly $k$ types A. And we denote by $p(t) = \sum_{k=1}^{n} (k/n) f_k(t)$ the frequency of types A
in the population. Because the number of groups is large, by the law of large numbers,
$(f_0(t), \ldots, f_n(t))$ evolves as a deterministic (non-linear) dynamical system in dimension
$n+1$. Our main concern in this paper is in finding the stable and unstable equilibria for
this dynamical system. This will be done under the assumption that selection is weak, i.e.,
$\delta \ll 1$.

21FW will be generalized in several ways in this paper. These generalizations are motivated
by biological considerations and, under weak selection, create little additional mathematical
difficulty. We motivate and introduce the generalizations in Gen. 1 to Gen. 4 below.
We then summarize the assumptions that we require, in a more abstract fashion, at the
end of the section.

(Gen. 1) Separation of generation time and competition time scales: In 21FW,
each group produces in the next generation a number of offspring groups that has a Poisson 
distribution. This may not be realistic in several biological applications, and therefore we 
will make the following more general assumption. Each group in generation t produces in 
the next generation a random number of groups with a given distribution (that does not 
depend on the number of groups g) and has mean proportional to its average group fitness 
($\bar{w}_k$ if it has $k$ types A and $n - k$ types N). The numbers of groups created in generation
t + 1 by the g groups in generation t are independent random variables, conditioned on the 
total number of groups in generation t + 1 to be again g. These assumptions allow one to 
use the law of large numbers as above, and also in Section 3 below. This extension of the
framework allows for important applications, in which groups typically create a single group
in the next generation, but occasional extinctions and recolonizations also allow for a group
to create 0 or 2 groups. This generalization allows for a separation of the time scales in
which individuals reproduce (1 generation) and the possibly much longer time scale in which 
groups create new groups. The kinds of cooperative behaviors that we have in mind here are 
those that only take place when there is direct group competition (e.g., inter-group conflict, 
or competition among groups for space opened by another group's extinction). In this way 
the inter-generation time scale is not the same as the scale of cooperation and competition 
within and among the groups. Those time scales of cooperation and competition are the 
same and are given by the time scale of group reproduction/extinction. In the analysis in 
this paper, we never have to consider what the distribution of the number of groups created 
by a group is (only the mean of this distribution matters), so that this generalization incurs 
no mathematical cost at all. (This is also the case of all the results in [ ], including the 
results under strong selection, which therefore extend with no change to the population 
structures defined by Gen. 1.)

(Gen. 2) Variable group size, local regulation: To motivate a second generalization of 
the 21FW, consider the special case of the generalization introduced in the previous para- 
graph, in which each group creates exactly one group in the next generation. For instance, 
each group might be located in a different habitat. In this case the population structure is 
the well known infinite island structure. Now, each group has size n and generates exactly 
one group of size n. This implies that the average absolute fitness (expected number of 
offspring) of members of each group must be 1, and therefore each group must have the 
same relative fitness. But this is incompatible with the behavior of types A benefiting (in 
terms of reproductive success) other members of their group at a cost (in terms of repro- 
ductive success) to themselves that is lower than the benefit produced. In such situations, 
the average fitness of the members of groups with many cooperators must be larger than 
that of groups with few cooperators (cooperative and altruistic behavior should be defined 
as "benefiting the group" , or more precisely, benefiting the average member of the group, 
in reproductive terms). In conclusion: there is no room for cooperative/altruistic behavior 
in this population structure, as it stands. This is the setting of [46], which is often referred 
to for this conclusion. By assuming a fixed number $n$ of members in each group, we force
$\bar{w}_k$ to be independent of $k$. Biological populations often admit elasticities in population
and group size that are not incorporated in this formulation [47]. But because we will only
consider weak selection ($\delta \ll 1$) in this paper, we can easily extend the framework to
accommodate this sort of elasticity, without having to be concerned with the details, and
without affecting the analysis. To see this, observe two facts. First, the assumption that
all the groups have size $n$ in 21FW is just a mathematical idealization/simplification of a
population in which there is a typical group size close to $n$ most of the time. Second, all
the fitnesses differ from 1 by a term of order $\delta$. Keeping these facts in mind, we
now generalize the 21FW framework by allowing group size to be different from $n$, but
only with probability of order $\delta$. For this purpose, we suppose that in each generation,
each group is created with a random size that has a distribution that depends on the size
and composition of its parental group in an appropriate way. If the parental group has size
$n$, then the size of its offspring groups concentrates almost all the probability on the size
$n$, except for a probability of order $\delta$. For other sizes of the parental group, we suppose
that its offspring groups have positive probability (even if $\delta = 0$) of having size $n$. Under
these conditions, in each generation all but a fraction of order $\delta$ of groups will have size $n$.
And following the history of any single group, it will have size $n$ almost all the time, the
exception being a fraction of generations of order $\delta$. Intuitively, $n$ is the typical group size,
but occasional deviations are allowed, where "occasional" is quantified by $\delta$. Now, since the
fitnesses $\bar{w}_k$ only differ from 1 by a term of order $\delta$, this new framework can accommodate
any such fitness function, meaning that the expected size of a group whose parent group
has $k$ types A and $n - k$ types N will be precisely $\bar{w}_k$. (Even if the framework only allows for
group sizes $n - 1$, $n$ and $n + 1$, this is sufficient flexibility to allow for any fitness functions
of the kind that we are considering. This intuitive idea is checked in detail in
Appendix A, where further biological considerations are also discussed.) Note that when
$\delta = 0$, as in Section 3, groups will always have size $n$, so that this generalization is of
no consequence in that section. When $\delta > 0$ is small this is not the case, but typically
groups still have size $n$. And because the case of small $\delta > 0$ can be analyzed as a small
perturbation of the case with $\delta = 0$ (separation of time scales), again the generalization in
this paragraph will be of no consequence in Sections 4 and 5, where weak selection will be
analyzed in this way.

(Gen. 3) Exceptional groups, exceptional migrants, weakly variable migration
rates, variable number of groups: When groups vary in size, the migration patterns
in large and small groups can differ from the typical ones. As in Gen. 2 above, we can
incorporate this generalization into the mathematical analysis provided that, except for
probabilities of order $\delta$, groups will have size $n$ and behave regularly. For the same reason,
a fraction of order $\delta$ of migrants may behave in a way that is different from the regular one,
even when they come from groups of size $n$. There is no restriction on the migration patterns
of the exceptional migrants. These considerations allow, for instance, for the inclusion in the
framework of groups that temporarily have size 0, and may represent an empty habitat.
Such an empty habitat may, for instance, be colonized by migrants from the migrant
population, without having a parental group, or alternatively be colonized by a single
parental group. The recolonization scheme will not affect our analysis and results, since
in each generation only a fraction of order $\delta$ of the groups may have been recolonized in
this generation or descend from a group that was recolonized recently. In other words, if
we select a group at random in some generation, it is likely (except for probability of order
$\delta$) to come from a lineage of regular groups deep into the past. Similarly, we can allow
some migrants to form a completely new group (which then has no parental group), but we
suppose that in each generation only a fraction of order $\delta$ of migrants behaves in this way.
Moreover, all the individuals may migrate at rates that depend on their type and that of
the other members of their group, but for groups of size $n$ this variability must be within
bounds of order $\delta$. This means that the migration rate of an individual of type $*$ (A or
N), in a group with a total of $k$ types A and $n - k$ types N, can be of the form $m(1 + \delta u_k^*)$,
where the $u_k^*$ are arbitrary. Some of the assumptions above imply that the number of groups is
not constant, and it is natural to assume that it is regulated by ecological forces towards
the typical number $g$. This idea can be incorporated into the mathematical framework by
assuming that, if in generation $t$ the number of groups is $g_t$, then the fitness of the groups
is multiplied by a regulation factor $H(g_t/g)$, where $H(1) = 1$ and $-1 < dH(s)/ds < 0$.
(These assumptions on $H$ imply that when the number of groups is smaller than $g$ it tends
to grow and when it is larger than $g$ it tends to shrink, and that it does not tend to
overshoot the target value $g$.) Under this assumption, there is no need to condition (as
we did in Gen. 1) on the total number of groups being created in a given generation. We
instead simply suppose that each group creates a number of groups with a distribution that
only depends on its group fitness (with the factor $H(g_t/g)$ common to all the groups).

(Gen. 4) General intragroup reproduction system: Biological considerations may
require the distribution of types of members of a group to depend on the distribution of
the types of the members of the parental group in a way that is different from that given
by the Fisher-Wright sampling $P_{k,i} = \mathrm{bin}(i \mid n, k w_k^A/(n \bar{w}_k))$. We can assume an arbitrary
intragroup transition matrix $P_{k,i}$, under the biologically meaningful conditions that when
$\delta = 0$, $\sum_i i P_{k,i} = k$ (neutral drift) and $P_{0,0} = P_{n,n} = 1$ (no mutations). Generalizing the
population structure in this fashion produces some minor but cumbersome complications
in (2), in Section 3. For simplicity we will only compute the more general expression in
Appendix B. Other than this, all the statements in Section 3 (no selection: $\delta = 0$) and
Section 4 (weak selection: small $\delta > 0$) hold for a general intragroup transition matrix
$P_{k,i}$. Additional, and again biologically meaningful, conditions on $P_{k,i}$ will be needed for
the results in Section 5 (case of large group size $n$ and small migration rate $m$) to hold. For
simplicity we will restrict ourselves to Fisher-Wright intragroup sampling when providing
numerical examples in Section 6. But as we will explain in Section 5 and Appendix B,
when $n$ is large, these numerical results should also be good approximations under broad
conditions on the intragroup sampling scheme.

In the following sections the setting is always that of 21FW, possibly generalized by
Gen. 1, Gen. 2 and Gen. 3. Everything will also apply under Gen. 4, unless stated otherwise,
in which case the appropriate modifications needed are presented in Appendix B. Even
additional generalizations would be covered by the results, provided that they satisfied the
assumptions stated explicitly below. It may be surprising that so many different aspects of
the population structure will produce almost no difference in the analysis and the results.
This is a feature of weak selection. The key fact is that the reference population structure
with $\delta = 0$ is the same and that it can be easily analysed. The evolution under a small
$\delta > 0$ is then a perturbation of this reference system, and to first order in $\delta$ does not depend
on the details of the perturbation. This makes the concept of weak selection a very
powerful tool in shedding light on evolutionary patterns. (In contrast, the analysis under
strong selection is highly dependent on the population structure.)
The assumptions used in Section 3 are: 

Assumptions about the reference population structure ($\delta = 0$): (1) From generation
$t$ to generation $t + 1$, each group produces independently (possibly conditioned on
the total number of groups produced being $g_{t+1} = g$) a random number of groups with
a common distribution with mean $H(g_t/g)$, where $H$ is a regulation function that has
$H(1) = 1$ and has derivative $-1 < dH(s)/ds < 0$ (as in Gen. 3). (Note that if we condition
on $g_t = g$ in each generation, then each group must produce on average one group in
the next generation.) (2) Each group being created has $n$ individuals, and the number $i$ of
individuals of type A in this group, conditioned on the number $k$ of individuals of type A
in its parental group, is given by a transition matrix $P_{k,i}$ that satisfies $\sum_i i P_{k,i} = k$ and
$P_{0,0} = P_{n,n} = 1$ (see Gen. 4 for motivation). (The case of Fisher-Wright sampling is defined
by $P_{k,i} = \mathrm{bin}(i \mid n, k/n)$.) (3) Once a new generation is created, migration occurs at rate
$m$, with migrants being randomly shuffled.
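
Condition (2) above is easy to check numerically for any candidate intragroup transition matrix. The following Python sketch is illustrative only: it builds the neutral Fisher-Wright matrix and tests the neutral-drift and no-mutation conditions. The function names and the tolerance `tol` are hypothetical choices, not part of the paper.

```python
from math import comb


def fisher_wright_matrix(n):
    """Neutral Fisher-Wright intragroup matrix: P[k][i] = bin(i | n, k/n)."""
    return [
        [comb(n, i) * (k / n) ** i * (1.0 - k / n) ** (n - i) for i in range(n + 1)]
        for k in range(n + 1)
    ]


def satisfies_reference_conditions(P, tol=1e-12):
    """Check that P is stochastic, has mean sum_i i*P[k][i] = k, and has P[0][0] = P[n][n] = 1."""
    n = len(P) - 1
    rows_ok = all(abs(sum(row) - 1.0) < tol for row in P)
    drift_ok = all(
        abs(sum(i * P[k][i] for i in range(n + 1)) - k) < tol for k in range(n + 1)
    )
    no_mutation = abs(P[0][0] - 1.0) < tol and abs(P[n][n] - 1.0) < tol
    return rows_ok and drift_ok and no_mutation
```

A matrix that lets a monomorphic group produce the other type, for example one with `P[0][0] < 1`, fails the check, which corresponds to the presence of mutation.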

The assumptions that are required for the perturbation approach used in Section 4 are 
also simple to state: 

Assumptions on weak selection (of strength $\delta$): (1) Suppose that a focal individual
is selected at random in a given generation, and we observe how many individuals of type
A and of type N are in the focal's group. We assume that, except for probabilities of
order $\delta$, there is no difference in taking this sample from the actual population or from
the reference population with $\delta = 0$. (Formally, we are requiring the distance between
the two distributions in total variation to be at most of order $\delta$. This means that the
probabilities of finding a given composition of the focal's group, when computed using the
actual population or the reference population, only differ by an amount of order $\delta$. This
is precisely what we observed to hold in Gen. 2 and Gen. 3.) (2) The average number of
offspring of a randomly chosen focal individual who is in a group with a total of $k$ types
A and $n - k$ types N is proportional to $w_k^* = 1 + \delta v_k^*$, where $*$ is the type, A or N, of the
focal individual. (Note that we do not need to make any assumption about these relative
fitnesses when the size of the focal's group is different from the typical size $n$.)

The separation of the conditions in two sets, one defining the restrictions on the $\delta = 0$
population structure, and one specifying how "$\delta$ effects" are allowed to affect it, is a central
element of our analysis. The idea of separation of time scales associated to weak selection,
that we will exploit in Section 4, is a standard one [40, 27, 41, 31], but in the literature one
often assumes a population structure (Wright's infinite island structure is a very common
choice) that is not affected (even slightly) when $\delta$ is positive. (For instance this is the case
in [ ], and as we explained in Gen. 2, for this reason the setting of that paper rules out the
possibility of altruistic behavior.) But weak selection can be the result of a weak force that
affects the reproductive success of individuals through its slight effect on the population
structure (e.g., rare extinction of groups due to natural disasters or warfare). This idea is
incorporated in the very mild restrictions that are being imposed on the effects of order $\delta$.
The complete flexibility in the kinds of $\delta$ effects that can modify the population structure
allows for a large range of natural and biological processes to be described in a concise
simplified way.

Should one generalize the population structure further? Two of the constraints on the
reference ($\delta = 0$) population structure that will be used in Section 3 are the fixed size
of groups, and the completely random unstructured migration at rate $m$ (each individual
once born is a migrant with probability $m$ independently of anything else, and migrants
are randomly shuffled). Releasing these constraints to allow, say, for groups of other sizes
in the reference population, or structured migration according to some migration matrix,
would require important modifications in Section 3. Such modifications are amenable to
some extent to the methods of population genetics, but are substantially more elaborate than
the case treated here. It is to avoid such technical issues that, up to effects of order $\delta$,
we restrict ourselves here to groups of fixed size and to unstructured migration. These
are typical idealizations of populations with group sizes that fluctuate around some typical
value, and have migration ranges that are not very short.

3 Neutral drift 

Before studying the evolution of $p(t)$ under weak selection, we consider the case in which
selection is completely absent, i.e., $\delta = 0$. In this case, there is no bias towards types A or
N. Hence, each individual in generation $t$ is of type A with probability $p(t-1)$, and again
by the law of large numbers, $p(t) = p(t-1)$. We can therefore regard $p(t) = p$ as constant
in the following computation. When a new generation is being born, each individual in a
group has its type, independently, with probability $1 - m$ sampled from the parental group
of its group, or, with probability $m$, sampled from the metapopulation, where types A occur
with frequency $p$. Therefore, assuming that the intragroup transition matrix is given by
Fisher-Wright sampling (see Appendix B for the general case), the probability $f_k(t)$ that a
group selected at random in generation $t$ will have $k$ types A will satisfy the recursion

$$f_k(t) = \sum_j T_{j,k}\, f_j(t-1), \qquad (1)$$

where

$$T_{j,k} = \mathrm{bin}(k \mid n, (1-m)(j/n) + mp). \qquad (2)$$

This means that $(f_0(t), \ldots, f_n(t))$ evolves as the probability distribution of a Markov chain
with transition matrix $T_{j,k}$, and converges to its stationary distribution, which we denote by
$\varphi(p) = (\varphi_0(p), \ldots, \varphi_n(p))$, given by the stationarity condition

$$\varphi_k(p) = \sum_j T_{j,k}\, \varphi_j(p), \qquad (3)$$

and $\sum_k \varphi_k(p) = 1$. (In the case of the infinite islands population structure, the results
above can be found in [ ], p. 413.) Note that

$$\sum_k k\, \varphi_k(p) = np, \qquad (4)$$



since the left-hand-side is the expected number of types A in a randomly chosen group, 
there are n individuals in each group, and each individual is type A with probability p. 
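
The stationary distribution defined by (2) and (3) is straightforward to compute numerically. The sketch below is a minimal illustration, assuming Fisher-Wright intragroup sampling; it iterates recursion (1) until the distribution stops changing. The function names, the uniform starting vector, and the stopping parameters `tol` and `max_iter` are hypothetical choices.

```python
from math import comb


def binom_pmf(k, n, q):
    """Probability of k successes in n independent trials with success probability q."""
    return comb(n, k) * q**k * (1.0 - q) ** (n - k)


def transition_matrix(n, m, p):
    """Neutral matrix of equation (2): T[j][k] = bin(k | n, (1 - m) * j / n + m * p)."""
    return [
        [binom_pmf(k, n, (1.0 - m) * (j / n) + m * p) for k in range(n + 1)]
        for j in range(n + 1)
    ]


def stationary_distribution(n, m, p, tol=1e-13, max_iter=1_000_000):
    """Approximate the stationary distribution of (3) by iterating recursion (1)."""
    T = transition_matrix(n, m, p)
    phi = [1.0 / (n + 1)] * (n + 1)  # arbitrary starting distribution
    for _ in range(max_iter):
        new_phi = [sum(phi[j] * T[j][k] for j in range(n + 1)) for k in range(n + 1)]
        if max(abs(a - b) for a, b in zip(new_phi, phi)) < tol:
            return new_phi
        phi = new_phi
    return phi  # not converged within max_iter; returned as a best effort
```

Identity (4) gives a quick consistency check on the output: the mean number of types A per group should equal n times p. Convergence slows down as m decreases, so very small migration rates need many more iterations.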

Some extreme cases will play important roles later in the analysis. (See Appendix C
for an alternative explanation/derivation of the claims below.) Note first that (4) implies
that

$$\varphi(p) \to (1, 0, \ldots, 0, 0), \quad \text{as } p \to 0, \qquad (5)$$
$$\varphi(p) \to (0, 0, \ldots, 0, 1), \quad \text{as } p \to 1. \qquad (6)$$

The case $m = 1$ clearly has

$$\varphi_k(p) = \mathrm{bin}(k \mid n, p). \qquad (7)$$

In the opposite extreme, when $m \to 0$, the irreducible transition matrix $T_{j,k}$ converges to
a transition matrix with traps at 0 and $n$, so that

$$\varphi(p) \to (1-p, 0, \ldots, 0, p), \quad \text{as } m \to 0, \qquad (8)$$

where we used (4) again.

In each one of the equilibria, the average relatedness $R$ between distinct members
of a randomly chosen group takes the same value (Appendix D provides a short review of
this and the other well known claims in this paragraph). To express it in a way that does
not require Fisher-Wright intragroup reproduction (and so includes Gen. 4), we recall that the
inbreeding effective group size, $n_e^g$, is defined as the inverse of the probability that, when
a new group is created and before migration takes place, two randomly chosen members
of this group have the same mother. In the case of Fisher-Wright intragroup sampling, we
then simply have $n_e^g = n$. With this definition,

$$R = \frac{(1-m)^2}{n_e^g - (n_e^g - 1)(1-m)^2} \approx \frac{1}{1 + 2 m n_e^g}, \qquad (9)$$

where the approximation is good when $m$ is small. Here relatedness can be defined through
lineages (identity by descent, IBD), regression coefficients, or Wright's $F_{ST}$ statistics.
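
As a quick numerical illustration of the relatedness formula and its small-m approximation, the following Python sketch evaluates both expressions. The function and argument names are hypothetical; `n_e` stands for the inbreeding effective group size.

```python
def relatedness_exact(m, n_e):
    """Exact expression: (1 - m)^2 / (n_e - (n_e - 1) * (1 - m)^2)."""
    s = (1.0 - m) ** 2
    return s / (n_e - (n_e - 1.0) * s)


def relatedness_approx(m, n_e):
    """Small-m approximation: 1 / (1 + 2 * m * n_e)."""
    return 1.0 / (1.0 + 2.0 * m * n_e)
```

Both functions return 1 when there is no migration (m = 0), and the exact expression returns 0 when every individual migrates (m = 1), as one would expect.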



4 Weak selection 

When selection is weak, i.e., $0 < \delta \ll 1$, a well-known separation of time scales argument
[40, 27, 41, 31] applies in the following way. Starting with $p(0) = p$, the population reaches
the quasi-equilibrium with distribution $\varphi_k(p)$ (up to an error term of order $\delta$) in a time of
order $1/m$, without feeling the effect of selection. But then selection produces changes in
$p(t)$ at a rate of order $\delta$. This is so because the fitnesses of types A and types N differ by
an amount of order $\delta$. To implement this idea, we write $\Delta p(t) = p(t+1) - p(t)$ and recall
the well-known formula

$$\bar{W} \Delta p = p\,(W^A - \bar{W}) = p(1-p)(W^A - W^N), \tag{10}$$

where $W^A$ and $W^N$ are the average fitnesses of types A and N, and $\bar{W} = pW^A + (1-p)W^N$ is
the average fitness of all individuals (all in generation $t$, which is omitted from the notation
for simplicity). To compute $W^A$ and $W^N$, we choose at random a focal individual from the
population. We denote by $I_\bullet$ a random variable that takes the value 1 if the focal is type
A and 0 if the focal is type N. And we denote by $w_\bullet$ a random variable that equals the
relative fitness of the focal individual. Then $W^A = E(w_\bullet \mid I_\bullet = 1)$ and $W^N = E(w_\bullet \mid I_\bullet = 0)$.
To compute these conditional expectations, we also denote by $K$ the random variable that
gives the total number of types A in the focal's group. Then

$$\Pr(K = k \mid I_\bullet = 1) = \frac{\Pr(I_\bullet = 1, K = k)}{p} = \frac{\Pr(I_\bullet = 1 \mid K = k)\, f_k(t)}{p} = \frac{k f_k(t)}{np}. \tag{11}$$

This means that if the focal is type A, it will have probability $k f_k(t)/(np)$ of being in a
group with exactly $k$ types A. (The factor $k$, which tilts the distribution $f_k(t)$, can be easily
understood as a bias term. Given that the focal is type A, it is more likely that it was chosen
from a group with many types A.) An analogous computation shows that if the focal is
type N, it will have probability $(n-k) f_k(t)/(n(1-p))$ of being in a group with exactly
$k$ types A. Therefore, $W^A = \sum_k k f_k(t) w_k^A/(np)$, and $W^N = \sum_k (n-k) f_k(t) w_k^N/(n(1-p))$.
When $f_k(t) = \varphi_k(p)$, we have then $W^A = 1 + \delta V^A(p)$ and $W^N = 1 + \delta V^N(p)$, where

$$V^A(p) = \frac{\sum_k k\, \varphi_k(p)\, v_k^A}{np}, \tag{12}$$

$$V^N(p) = \frac{\sum_k (n-k)\, \varphi_k(p)\, v_k^N}{n(1-p)}. \tag{13}$$



And now from (10),

$$\Delta p = \delta\, p(1-p)\,[V^A(p) - V^N(p)] + O(\delta^2). \tag{14}$$



This equation tells us that if we accelerate time by a factor $1/\delta$ (so that the rescaled
time is $s = t\delta$), then $p(t)$ will evolve as the solution $x = x(s)$ to the differential equation
$dx/ds = x(1-x)[V^A(x) - V^N(x)]$, started from $x(0) = p$. The equilibria are the values of
$p$ for which

$$D(p) = p(1-p)\,[V^A(p) - V^N(p)] = 0. \tag{15}$$

They are $p = 0$, $p = 1$ and the internal ones, where $V^A(p) = V^N(p)$. Their stability is also
readily obtained, by observing the sign of $D(p)$ close to each one of them (decreasing $D(p)$
$\Rightarrow$ stability, increasing $D(p)$ $\Rightarrow$ instability). Stability of the dynamical system can be
equivalently thought of as stability with respect to a small perturbation in the frequency
of alleles, or with respect to a small but positive mutation rate [21] (see also SOM of [44],
Section 9).

In [ ], we were concerned with the stability of the equilibrium with $p = 0$, i.e., we were
interested in determining when rare mutants A can invade a population of types N. This
problem was solved there (for 21FW) with no assumption on the strength of selection. The
simplification of that general solution in the case of weak selection was then provided in
display (2) in that paper. (We review the content of that condition from [ ] in Appendix
E, for use in Section 6.) That condition is therefore equivalent to the condition that
$\lim_{p \to 0}[V^A(p) - V^N(p)] > 0$. But from (5), we have $\lim_{p \to 0} V^N(p) = v_0^N = 0$, and the
condition for invasion by rare types A becomes

$$\lim_{p \to 0} V^A(p) = \sum_{k=1}^{n} \left( \lim_{p \to 0} \frac{k\, \varphi_k(p)}{np} \right) v_k^A > 0. \tag{16}$$

The case in which $m = 1$, in which groups assort randomly in each generation, is
equivalent to the trait-group framework (see, e.g., Sec. 2.3.2 of [ ]). In this case (7) and
simple manipulations with binomial coefficients transform (12), (13) and (14) into the
well-known forms

$$V^A(p) = \sum_{k=1}^{n} \mathrm{bin}(k-1 \mid n-1, p)\, v_k^A, \qquad V^N(p) = \sum_{k=1}^{n} \mathrm{bin}(k-1 \mid n-1, p)\, v_{k-1}^N, \tag{17}$$

$$\Delta p = \delta\, p(1-p) \sum_{k=1}^{n} \mathrm{bin}(k-1 \mid n-1, p)\,(v_k^A - v_{k-1}^N) + O(\delta^2). \tag{18}$$

And the condition (16), for invasion by rare types A, becomes simply $v_1^A > 0$. Under
the strong altruism condition $v_k^A < v_{k-1}^N$ (meaning that each type A would be better off
mutating into a type N), ( ) implies the well-known result [32, 23] that, when $m = 1$, the
only stable equilibrium is the one with no types A, and types N fixate when present. The
same is therefore also true, under this strong altruism condition, when $m$ is close to 1.
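The reduction from (12) and (13) to (17) in the case $m = 1$ can be checked numerically.
The sketch below is only an illustration under our own naming: it evaluates (12) and (13)
with the binomial distribution (7) and compares the result with the sums in (17), for
arbitrary payoff arrays `vA` and `vN` (hypothetical inputs, indexed by $k = 0, \ldots, n$).

```python
from math import comb
import random


def binom_pmf(j: int, size: int, p: float) -> float:
    """bin(j | size, p): binomial probability of j successes in `size` trials."""
    return comb(size, j) * p**j * (1.0 - p) ** (size - j)


def V_from_distribution(phi, vA, vN, p):
    """Evaluate (12) and (13) for a distribution phi[0..n] over the number k of types A.

    vA[k] is used for k = 1..n and vN[k] for k = 0..n-1; other entries are ignored.
    """
    n = len(phi) - 1
    VA = sum(k * phi[k] * vA[k] for k in range(1, n + 1)) / (n * p)
    VN = sum((n - k) * phi[k] * vN[k] for k in range(0, n)) / (n * (1.0 - p))
    return VA, VN


def V_trait_group(vA, vN, n, p):
    """Evaluate the two sums in (17), valid when m = 1."""
    VA = sum(binom_pmf(k - 1, n - 1, p) * vA[k] for k in range(1, n + 1))
    VN = sum(binom_pmf(k - 1, n - 1, p) * vN[k - 1] for k in range(1, n + 1))
    return VA, VN


if __name__ == "__main__":
    random.seed(0)
    n, p = 8, 0.3
    vA = [random.uniform(-1, 1) for _ in range(n + 1)]  # arbitrary test payoffs
    vN = [random.uniform(-1, 1) for _ in range(n + 1)]
    phi = [binom_pmf(k, n, p) for k in range(n + 1)]    # (7): phi_k(p) when m = 1
    print(V_from_distribution(phi, vA, vN, p))
    print(V_trait_group(vA, vN, n, p))
```

The two printed pairs should coincide up to rounding, since the size-bias factors $k/(np)$
and $(n-k)/(n(1-p))$ turn $\mathrm{bin}(k \mid n, p)$ into $\mathrm{bin}(k-1 \mid n-1, p)$ and
$\mathrm{bin}(k \mid n-1, p)$, respectively.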
In the opposite extreme, when $m \to 0$, using (8) we transform (12), (13) and (14) into

$$V^A(p) = v_n^A, \qquad V^N(p) = v_0^N = 0, \tag{19}$$

$$\Delta p = \delta\, p(1-p)\, v_n^A + O(\delta^2). \tag{20}$$

This means that if $v_n^A > 0$, then for $m$ close to 0, not only will rare types A invade the
population, but they will fixate. The only equilibria are then the ones with $p = 0$ and
$p = 1$, the former being unstable and the latter stable.

Consider now a model in which the following two conditions hold:

(C1) $v_1^A < 0$, (C2) $v_n^A > 0$.

These conditions are associated, for instance, with a social allele A that promotes a certain
behavior that is costly to an isolated actor, but that also provides a net benefit to each
actor when all members of the group behave in this way. In this case, from the conclusions
in the last two paragraphs, we learn that there are two critical values of the migration
parameter $m$, namely $0 < m_f \le m_s < 1$, which play the following roles. The critical
point $m_s$, already introduced in [ ], is the threshold value for types A to be able to invade
when rare (for $m > m_s$ allele A cannot invade when rare). The critical value $m_f$ is the
threshold value for types A to fixate after invasion (for $m < m_f$ allele A not only invades
when rare, but it then fixates). (The subscripts 's' and 'f' stand for 'survival' and 'fixation',
respectively, of the allele A.)

Because (9) establishes a one-to-one relationship between $m$ and $R$, and because it is
common to measure $m$ indirectly through $R$, it is natural to also introduce $R_s$ and $R_f$ as
the values of $R$ that correspond to $m_s$ and $m_f$, respectively.
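The correspondence between $m$ and $R$ can be made explicit by inverting (9). The sketch
below is a minimal illustration, assuming Fisher-Wright intra-group reproduction so that
$n_e^g = n$; the function names are ours. Solving (9) for $q = (1-m)^2$ gives
$q = R\, n_e^g / (1 + R\,(n_e^g - 1))$.

```python
from math import sqrt


def relatedness(m: float, n_eg: float) -> float:
    """Exact expression in (9)."""
    q = (1.0 - m) ** 2
    return q / (n_eg - (n_eg - 1.0) * q)


def migration_from_relatedness(R: float, n_eg: float) -> float:
    """Invert (9): solve for q = (1 - m)^2 first, then return m = 1 - sqrt(q)."""
    q = R * n_eg / (1.0 + R * (n_eg - 1.0))
    return 1.0 - sqrt(q)


if __name__ == "__main__":
    # R = 1/3 is the Hamilton threshold C/B for C = 1, B = 3 (linear public goods game).
    for n in (10, 18, 50):
        m = migration_from_relatedness(1.0 / 3.0, n)
        print(n, m, relatedness(m, n))  # the last column recovers R = 1/3
```

For $R = 1/3$ and $n = 10$ this yields a migration rate close to 8.7%, in line with the
value quoted for the linear public goods game in the caption of Fig. 3.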

In Section 6, we will see important examples for which $m_f < m_s$ ($R_s < R_f$), so that
there is an intermediate regime in which types A invade and evolve to a polymorphic
equilibrium. This is nevertheless not the case, as is well known [16, 40, 39], for the linear
public goods game, in which at a cost $C$ to itself each type A contributes a benefit $B$
shared by the other members of its group. As we review in Appendix F, for this model
$V^A(p) - V^N(p) = -C + BR$ does not depend on $p$, so that $R_s = R_f$ is the solution to
Hamilton's equation $C = BR$.

5 Limit of large group size and small migration rate 

We turn now to the case in which n is large and m is small. It is a classical result of S. Wright 
[54, 10, 50] (regarding generalization Gen. 4, the conditions on P^ for this result to hold 
are discussed in Appendix B - basically one only has to assume that individuals produce 
statistically independent numbers of offspring according to their fitnesses, conditioned on
the appropriate total number of offspring in the group) that if we take the limit in which 

$$n \to \infty, \quad m \to 0, \quad 2 m n_e^g \to \ell \quad \left(\text{so that, in particular, } R \to \tfrac{1}{1+\ell}\right), \tag{21}$$

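then for every $0 < y < 1$,

$$\sum_{k \le yn} \varphi_k(p) \to \int_0^y \mathrm{beta}(x \mid \ell p, \ell(1-p))\, dx, \tag{22}$$

where $\mathrm{beta}(x \mid \alpha, \beta)$ is the density of a beta distribution with parameters $\alpha$ and $\beta$, which up
to a normalization constant is given by $x^{\alpha-1}(1-x)^{\beta-1}$. This means that when we rescale
$k$ by dividing it by $n$, producing a variable $x = k/n$ in the interval $[0, 1]$, the distribution
of $x$ given by $\varphi_k(p)$ is close to that of a $\mathrm{beta}(x \mid \ell p, \ell(1-p))$. Suppose we also have the
approximations $v_k^A \approx \tilde{v}^A_{k/n}$ and $v_k^N \approx \tilde{v}^N_{k/n}$ (uniformly in $k$) for some piecewise continuous
functions $\tilde{v}_x^A$ and $\tilde{v}_x^N$, $0 \le x \le 1$. Then, for large $n$, ( ) and ( ) can be replaced by

$$V^A(p) = \int_0^1 \mathrm{beta}(x \mid \ell p + 1, \ell(1-p))\, \tilde{v}_x^A\, dx, \tag{23}$$

$$V^N(p) = \int_0^1 \mathrm{beta}(x \mid \ell p, \ell(1-p) + 1)\, \tilde{v}_x^N\, dx. \tag{24}$$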

The changes in the values of the parameters $\alpha$ and $\beta$ in the beta distributions that appear
in (23) and (24), from those of the beta distribution in (22), are due to the bias terms $k/n$,
or $(n-k)/n$, in (12) and (13), respectively. These yield factors $x$ and $(1-x)$, respectively,
which are incorporated into the density of the beta distribution, modifying its parameters.
Note that the factors $p$ and $1-p$ in the denominators in (12) and (13) are simply normalizing
factors, and are absorbed into the normalization of the beta distributions.

The condition (16) for invasion by rare mutants A becomes, in the limit (21),

$$\lim_{p \to 0} V^A(p) = \int_0^1 \mathrm{beta}(x \mid 1, \ell)\, \tilde{v}_x^A\, dx > 0, \tag{25}$$

or, equivalently,

$$\int_0^1 (1-x)^{\ell-1}\, \tilde{v}_x^A\, dx > 0, \tag{26}$$

which has already appeared implicitly in display (3) in [ ], where it was derived in a
different way.

It is important to emphasize that the results in this section do not depend on the
details of the population structure, including the reproductive system. Only the scaled
population parameter $m n_e^g = \ell/2$, which represents gene flow, or, equivalently, the average
group relatedness, $R = 1/(1 + 2 m n_e^g)$, appears. In the case of the linear public goods
game, $\tilde{v}_x^A = -C + Bx$, and ( ) is equivalent to Hamilton's rule $C < BR$. As we pointed
out in [ ], it is natural to consider (26) as an extension of that rule to non-linear games.
It is pleasant to observe how this combines two major ideas in evolutionary biology: the
idea from population genetics of finding scaling parameters that summarize the relevant
features of a large class of population structures, and Hamilton's key idea that genetic
relatedness could provide, under appropriate assumptions, the mathematical conditions for
the proliferation of a cooperative allele. While both ideas have their limitations, the results
in this section provide a case in which both are vindicated.
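For the linear public goods game the integral in (26) can be evaluated in closed form,
which makes the equivalence with Hamilton's rule explicit. The short computation below
uses only the payoff $-C + Bx$ and the limiting relation $R = 1/(1+\ell)$; the symbol $\ell$
for the limiting value of $2 m n_e^g$ is our notation for the scaled gene flow parameter.

```latex
\begin{align*}
\int_0^1 (1-x)^{\ell-1}\,(-C + Bx)\,dx
  &= -C \int_0^1 (1-x)^{\ell-1}\,dx \;+\; B \int_0^1 x\,(1-x)^{\ell-1}\,dx \\
  &= -\frac{C}{\ell} + \frac{B}{\ell(\ell+1)}
   = \frac{1}{\ell}\left(\frac{B}{1+\ell} - C\right)
   = \frac{1}{\ell}\,(BR - C).
\end{align*}
```

Since $\ell > 0$, the integral is positive exactly when $C < BR$.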



6 Examples 

The conditions for equilibrium, stability and invasion, derived in the previous sections, apply
with equal ease to any marginal fitness functions $v_k^A$ and $v_k^N$. We focus in this section on two
examples that are representative of some important biological mechanisms.

In the first of these two examples (IPG), non-linearities in the marginal fitness functions
$v_k^A$ and $v_k^N$ result from the iteration of a basic game and contingency of behavior in each
iteration. By assuming that the underlying base game is a linear public goods game (PG),
one can analyze the effect of non-linearity resulting only from the iterative/conditional
aspect of the life-cycle interaction among group members. This also makes the PG a
special (extreme) case of the class of models being considered (the case when the number
of iterations is 1).

In the second example (THR) we consider a very simple type of non-linearity in the
marginal fitness functions $v_k^A$ and $v_k^N$, that incorporates synergetic effects of cooperation, by
requiring a minimum number of cooperators for successful outcomes, as well as saturation
effects, when there are many cooperators. We then also consider iterations of such a game
and observe the effect of the number of iterations.

The examples in this section illustrate some of our main concerns and conclusions. 
Non-linear marginal fitness functions associated to iteration and contingencies are natural 
in group structured populations, where group members live together for spans of time that 
are long compared to the time frame of their vital activities. For instance, many vertebrates 
live together and hunt together hundreds or thousands of times during their life-cycle. This 
gives plausibility to figures like the ones in the graphs below. For reasonable values of the 
parameters (group size, cost/benefit ratio for each iteration, number of iterations), we
often obtain values of $R_s$ that are modest (below 10%) as compared to available data on
$F_{ST}$ values [ ] (Tables 6.4 and 6.5), [15] (Table 4.9) and [ ] (see Appendix D for the relation
between relatedness and $F_{ST}$). In contrast, the values of $R_f$ are substantially higher. This
indicates that coexistence of cooperators and defectors should not be surprising.

The iterative nature of the games and the conditional behavior of the cooperators play 
central roles in allowing for low values of $R_s$. One can only expect a modest value of
$R_s$, if the benefits of cooperation in groups with many cooperators are substantially larger
than the costs of cooperation in groups with few cooperators. This may be unlikely to 
happen with a single iteration of a biologically realistic game, due to physical and biological 
constraints. Also if a game is iterated, but behavior does not change from iteration to 
iteration, the same negative conclusion applies, since payoffs are then simply multiplied by 
the number of iterations. But once nature has endowed individuals with mechanisms that 
prompt them to change behavior based on past experience, the goal becomes realistic, as 
we see below. 

(IPG) Iterated public goods game. Types A cooperate conditionally:

$$v_k^A = \begin{cases} -C + (k-1)B/(n-1), & \text{if } k \le a, \\ T\,(-C + (k-1)B/(n-1)), & \text{if } k > a, \end{cases} \qquad v_k^N = \begin{cases} kB/(n-1), & \text{if } k \le a, \\ TkB/(n-1), & \text{if } k > a, \end{cases}$$

for constants $0 < C < B$, $T \ge 1$ and $a \in \{1, 2, \ldots, n-1\}$. This example was first studied
independently in [ , ], basically in the context in which $m = 1$. In this example a public
goods game (PG) is repeated a random number of times $\tau \ge 1$, with average $E(\tau) = T$.
Each time each member of the group can cooperate at a cost $C$ to itself, resulting in a benefit
$B/(n-1)$ to each one of the other members of its group. Defectors incur no costs and
produce no benefits. We suppose that types A cooperate in the first round, and afterwards
only cooperate if at least $a$ other members of the group cooperated in the previous round.
This is often called a "trigger strategy" (with threshold $a$) and is a generalization of the
well-known tit-for-tat strategy, which corresponds to the case $n = 2$, $a = 1$. In [44]
we focused on the case in which each type A cooperates in the first round, and in later
rounds cooperates if and only if its payoff in the previous round was non-negative. This
corresponds to taking $a$ as the smallest integer that is larger than or equal to $(n-1)C/B$
(since the PG has $v_k^A < 0$ if and only if $k < 1 + (n-1)C/B$). In this important case,
we call the strategy of types A "payoff dependent contingent cooperation". The idea is
that after an act of cooperation, each individual who cooperated receives a feedback that
indicates if cooperation should continue or not. If a negative value of $v_k^A$ is associated with a
negative feedback, then types A will discontinue cooperation precisely according to the payoff
dependent contingent cooperation rule. Types A in this example do not have to keep track
of the identity of group members who cooperated or defected, and do not have to count how
many cooperated. They only have to be predisposed to discontinue behaviors that hurt
them (in net terms), and continue behaviors that are beneficial to them (in net terms).
The conditionally cooperative behavior of types A in this example is in this sense closely
related to generalized reciprocity mechanisms [ ] with low cognitive requirements. The
assumption that individuals discontinue behavior after a single unsuccessful participation
is a simplification. When this is not a realistic assumption, one can interpret the parameter
$T$ as the ratio between the typical number of repetitions of the activity and the typical
number of unsuccessful attempts before cooperation is discontinued by a type A.

Figure 1: Payoff (marginal fitness) profiles for IPG. Payoffs $v_k^A$ for types A are represented
by black squares, while red circles depict payoffs $v_k^N$ for types N. In these pictures, $n = 20$,
$C = 1$, $B = 5$, and $a = 4$. From left: $T = 1$, $T = 2$ and $T = 5$.
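The IPG payoff profiles can be tabulated directly from their definition. The following is
a minimal sketch with function names of our own choosing; it reads the trigger rule as
"cooperation persists beyond the first round exactly when $k > a$", and it computes the
payoff dependent threshold as the smallest integer not below $(n-1)C/B$.

```python
from math import ceil


def ipg_payoffs(n: int, C: float, B: float, T: float, a: int):
    """Return (vA, vN) for the IPG.

    vA[k] is defined for k = 1..n (vA[0] is None); vN[k] for k = 0..n-1 (vN[n] is None).
    """
    vA = [None] * (n + 1)
    vN = [None] * (n + 1)
    for k in range(n + 1):
        rounds = T if k > a else 1  # cooperation continues only if k > a
        if k >= 1:
            vA[k] = rounds * (-C + (k - 1) * B / (n - 1))
        if k <= n - 1:
            vN[k] = rounds * k * B / (n - 1)
    return vA, vN


def payoff_dependent_threshold(n: int, C: float, B: float) -> int:
    """Smallest integer a with a >= (n - 1) C / B (subject to floating-point rounding)."""
    return ceil((n - 1) * C / B)


if __name__ == "__main__":
    n, C, B = 20, 1.0, 5.0                 # parameter values of Figure 1
    a = payoff_dependent_threshold(n, C, B)
    for T in (1, 2, 5):
        vA, vN = ipg_payoffs(n, C, B, T, a)
        print(T, a, vA[1:6], vN[0:5])
```

With the parameters of Figure 1 the computed threshold is $a = 4$, the value used there.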



(THR) Threshold model:

$$v_k^A = \begin{cases} -C, & \text{if } k < \theta, \\ -C + A, & \text{if } k \ge \theta, \end{cases} \qquad v_k^N = \begin{cases} 0, & \text{if } k < \theta, \\ A', & \text{if } k \ge \theta, \end{cases}$$

for positive constants $C$, $A$ and $A'$, and an integer $\theta \in \{1, 2, \ldots, n\}$. The idea here is simple:
the allele A carries a cost, but allows its carriers to gain benefits if sufficiently many are
in the group. Unless otherwise stated, we assume that $A > C$. Types N obtain benefits
also when types A do, but we allowed for the possibility that these benefits are different
from those of the types A. These payoff functions can be seen as simplifications of more
realistic ones, in which payoffs to types A initially grow slowly with $k$, then steeply, and
then quickly saturate. For instance, hunting of large prey may require a minimum number
of hunters, but above that threshold number there can be little benefit to adding more
hunters. Also the model of coordinated punishment of [6] provides an example of this sort.
In [ ] a class of models that bridge between the PG and the THR was introduced and their
relevance discussed. Similarly to the results of [ ] and [ ] for the IPG, it was shown in
[2] that the THR can have polymorphic equilibria, with types A and N coexisting, when
assortment is random ($m = 1$ case, trait-group framework). In this case, as for the IPG,
types A can nevertheless never proliferate when $p$ is small (the equilibrium with no types
A, $p = 0$, is stable). In [49], the case $n = 3$ of the threshold model (called there "stag hunt
game") was discussed in connection to the conceptual issue of the role of Hamilton's rule.
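The behavior of the THR under random assortment can be explored numerically from (18).
The sketch below is an illustration only: it evaluates $D(p)$ for $m = 1$ (without the
factor $\delta$), scans a grid for sign changes, refines them by bisection, and classifies
each interior equilibrium by the sign rule of Section 4. The parameter values
($n = 10$, $\theta = 4$, $C = 1$, $A = 11$, $A' = 20$) and all function names are our own
choices for the example.

```python
from math import comb


def thr_payoffs(n, theta, C, A, A_prime):
    """THR payoffs vA[k], vN[k] for k = 0..n."""
    vA = [(-C + A) if k >= theta else -C for k in range(n + 1)]
    vN = [A_prime if k >= theta else 0.0 for k in range(n + 1)]
    return vA, vN


def D_trait_group(p, n, vA, vN):
    """p (1 - p) * sum_k bin(k-1 | n-1, p) (vA[k] - vN[k-1]); cf. (15) and (18) with m = 1."""
    gain = sum(
        comb(n - 1, k - 1) * p ** (k - 1) * (1 - p) ** (n - k) * (vA[k] - vN[k - 1])
        for k in range(1, n + 1)
    )
    return p * (1 - p) * gain


def interior_equilibria(D, grid=2000):
    """Find sign changes of D on (0, 1); D decreasing through zero means stable."""
    found = []
    points = [i / grid for i in range(1, grid)]
    for lo, hi in zip(points, points[1:]):
        d_lo, d_hi = D(lo), D(hi)
        if d_lo * d_hi < 0:
            a, b = lo, hi
            for _ in range(60):  # bisection refinement
                mid = 0.5 * (a + b)
                if D(a) * D(mid) <= 0:
                    b = mid
                else:
                    a = mid
            found.append((0.5 * (a + b), "stable" if d_lo > 0 else "unstable"))
    return found


if __name__ == "__main__":
    n = 10
    vA, vN = thr_payoffs(n, theta=4, C=1.0, A=11.0, A_prime=20.0)
    D = lambda p: D_trait_group(p, n, vA, vN)
    print("D near p = 0:", D(0.01))   # negative: rare types A cannot invade
    print(interior_equilibria(D))
```

For these values the scan finds two interior equilibria, an unstable one followed by a
stable one, while $D(p) < 0$ near $p = 0$, so that types A coexist with types N only if
they start above the unstable point.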

The THR behaves in a very simple way under iteration. Suppose that the game is
repeated $T$ times over a life-cycle. And suppose that types A are "payoff dependent
conditional cooperators", meaning that a type A cooperates in the first round, and in later
rounds cooperates if and only if its payoff in the previous round was positive. The result
is that types A cooperate exactly once if $k < \theta$, and cooperate $T$ times if $k \ge \theta$. The total
payoffs to types A and to types N over the life-cycle are then those of a THR with the same
value of $C$, with $A$ replaced by $T(A - C) + C$ and $A'$ replaced by $TA'$. Fig. 2 illustrates
the case in which for the base game $n = 10$, $\theta = 4$, $C = 1$, $A = A' = 2$, so that the game
iterated $T$ times has the parameters as indicated in the figure caption.
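The equivalence between the iterated THR and a single THR with rescaled benefits can be
checked by direct simulation of the rounds. This is a minimal sketch under the stated
rule (cooperate first, then continue only after a positive payoff) and under the assumption
$A > C$; the function names are ours.

```python
def thr_round(k, theta, C, A, A_prime):
    """Single-round THR payoffs (to a type A, to a type N) when k types A cooperate."""
    success = k >= theta
    return (-C + (A if success else 0.0), A_prime if success else 0.0)


def iterated_thr(k, theta, C, A, A_prime, T):
    """Total life-cycle payoffs when types A are payoff dependent conditional cooperators."""
    total_A = total_N = 0.0
    cooperating = True                 # types A cooperate in the first round
    for _ in range(T):
        if not cooperating:
            break                      # nobody cooperates any more: all later payoffs are 0
        pay_A, pay_N = thr_round(k, theta, C, A, A_prime)
        total_A += pay_A
        total_N += pay_N
        cooperating = pay_A > 0        # continue only after a positive payoff
    return total_A, total_N


if __name__ == "__main__":
    n, theta, C, A, A_prime = 10, 4, 1.0, 2.0, 2.0   # base game of Fig. 2
    for T in (1, 3, 10):
        A_eff, A_prime_eff = T * (A - C) + C, T * A_prime
        for k in range(n + 1):
            direct = iterated_thr(k, theta, C, A, A_prime, T)
            rescaled = thr_round(k, theta, C, A_eff, A_prime_eff)
            assert abs(direct[0] - rescaled[0]) < 1e-12
            assert abs(direct[1] - rescaled[1]) < 1e-12
        print(T, A_eff, A_prime_eff)
```

For the base game of Fig. 2 the rescaled parameters come out as $T + 1$ and $2T$, matching
the figure caption.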



Figure 2: Payoff (marginal fitness) profiles for iterated THR. Payoffs $v_k^A$ for types A are
represented by black squares, while red circles depict payoffs $v_k^N$ for types N. In these
pictures $n = 10$, $\theta = 4$, $A = T + 1$ and $A' = 2T$. From left: $T = 1$, $T = 3$ and $T = 10$.
The horizontal axis in each panel is $k$.



Numerical analysis. Figures: We restricted ourselves to Fisher-Wright intra-group
reproduction in the computations. So, strictly speaking, the figures below refer to 21FW
generalized by Gen. 1, Gen. 2 and Gen. 3 and not Gen. 4. But, as explained in Section 5 and
Appendix B, when $n$ is large these results are also good approximations to a broad class of
intra-group reproductive systems addressed in Gen. 4, provided we set $n = n_e^g$, as defined
at the end of Section 3.

In the figures below, the equilibria (indicated by the frequency $p$ of types A) were
obtained by solving (15), and their stability was obtained from the sign of $D(p)$ in their
neighborhood (decreasing $D(p)$ $\Rightarrow$ stability, increasing $D(p)$ $\Rightarrow$ instability). Instead
of the migration rate $m$, or the gene flow $nm$, we expressed the results as functions of
the relatedness parameter $R$, given by (9). This allows easier comparison with $F_{ST}$ data
[17] (Tables 6.4 and 6.5), [ ] (Table 4.9) and [ ] (see Appendix D for the relation between
relatedness and $F_{ST}$). We computed $R_s$ independently of the computations of the equilibria,
by the methods of [ ] (see Appendix E). A similar computation was made of the point
$R_s^N$, above which types N cannot invade a population of types A (i.e., above which the
point with $p = 1$ is stable). These values are indicated with dashed vertical lines (red and
magenta) on the graphs that display the equilibria as functions of $R$. The agreement with
the stability of the equilibria $p = 0$ and $p = 1$, obtained from (15), is clear in the pictures.
We are not aware of any general procedure for computing $R_f$, since it is defined in global
terms (above it, the equilibrium with $p = 1$ becomes the only stable equilibrium). But
typically we have $R_f = \max\{R_s, R_s^N\}$, which happens in all our pictures, and allows for the
determination of $R_f$ in these cases. Most commonly we observed $R_f = R_s^N$, but in Fig. 8
we have an exception.

Figure 3: IPG: $n = 10$, $C = 1$, $B = 3$, $a = 4$. (Top Left) $T = 1$ (in this panel
IPG = PG): $R_s = R_f = 33\%$ ($m_s = m_f = 8.7\%$). (Top Right) $T = 10$: $R_s = 14.8\%$
($m_s = 20.3\%$) and $R_f = 27.6\%$ ($m_f = 11\%$). (Bottom Left) $T = 100$: $R_s = 6.8\%$
($m_s = 35\%$) and $R_f = 27.2\%$ ($m_f = 11.2\%$). Open cyan circles represent unstable fixed
points, black stars represent stable fixed points. The critical relatedness for invasion by
types A, $R_s$, is indicated by a red dashed line, while the critical relatedness for invasion
by types N, $R_s^N$, is depicted as a magenta dashed line. The pictures show that here
$R_f = R_s^N$. (Bottom Right) Same parameter values as in the bottom-left panel: profiles for
$D(p) = p(1-p)[V^A(p) - V^N(p)]$ at (from bottom to top) $R = 1\%$ (blue), 5% (green), 10%
(red), 15% (cyan), 20% (magenta), 25% (yellow) and 30% (black).

Figure 4: IPG: $n = 18$, $C = 1$, $B = 3$, $a = 4$. (Top Left) $T = 1$ (in this panel
IPG = PG): $R_s = R_f = 33\%$ ($m_s = m_f = 5.1\%$). (Top Right) $T = 10$: $R_s = 18.4\%$
($m_s = 10.4\%$) and $R_f = 32.2\%$ ($m_f = 5.3\%$). (Bottom Left) $T = 100$: $R_s = 13.5\%$
($m_s = 14.1\%$) and $R_f = 32.1\%$ ($m_f = 5.4\%$). Open cyan circles represent unstable fixed
points, black stars represent stable fixed points. The critical relatedness for invasion by
types A, $R_s$, is indicated by a red dashed line, while the critical relatedness for invasion
by types N, $R_s^N$, is depicted as a magenta dashed line. The pictures show that here
$R_f = R_s^N$. (Bottom Right) Same parameter values as in the bottom-left panel: profiles for
$D(p) = p(1-p)[V^A(p) - V^N(p)]$ at (from bottom to top) $R = 10\%$ (blue), 15% (green), 20%
(red), 25% (cyan), 30% (magenta), 35% (yellow) and 40% (black).

Figure 5: IPG: $n = 50$, $C = 1$, $B = 3$, $a = 16$. (Top Left) $T = 1$ (in this panel
IPG = PG): $R_s = R_f = 33.3\%$ ($m_s = m_f = 1.95\%$). (Top Right) $T = 10$: $R_s = 17\%$
($m_s = 4.55\%$) and $R_f = 31.4\%$ ($m_f = 2.12\%$). (Bottom Left) $T = 100$: $R_s = 9.76\%$
($m_s = 8.1\%$) and $R_f = 31.1\%$ ($m_f = 2.14\%$). (Bottom Right) $T = 1000$: $R_s = 6.3\%$
($m_s = 12.3\%$) and $R_f = 31.1\%$ ($m_f = 2.14\%$). Open cyan circles represent unstable fixed
points, black stars represent stable fixed points. The critical relatedness for invasion by
types A, $R_s$, is indicated by a red dashed line, while the critical relatedness for invasion by
types N, $R_s^N$, is depicted as a magenta dashed line. The pictures show that here $R_f = R_s^N$.
The model and parameters are the same as in Fig. 2 in [ ], which is restricted to the
problem of invasion by types A, but includes strong selection and comparison with the limit
of large $n$. In this limit, it follows from (26), as computed in the supplementary material
of [ 4], that for large $T$ we have $R_s = -\ln(1 - C/B)/\ln(T)$ approximately. Fig. 2 in [44]
indicates that for $T = 10, 100, 1000$, this approximation is very good. As also observed in
[44], in this example, types A are strongly altruistic, in the sense that $v_k^A < v_{k-1}^N$ (each type
A would be better off mutating into a type N).
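To see how the large-$n$ invasion condition (26) can be used in practice for the IPG, the
sketch below solves for the critical relatedness numerically. It is an illustration under
assumptions of our own: the large-$n$ payoff profile is taken to be $-C + Bx$ for
$x \le x_0$ and $T(-C + Bx)$ for $x > x_0$, where `x0` stands for the limit of $a/n$
(a hypothetical parameter name), the integral is evaluated in closed form after the
substitution $u = 1 - x$, and the root in $\ell$ is located by bisection, with
$R = 1/(1+\ell)$. The large-$T$ expression quoted in the caption of Fig. 5 is printed
alongside for comparison.

```python
from math import log


def _G(u, ell, B, C):
    """Antiderivative in u = 1 - x of u**(ell - 1) * ((B - C) - B * u)."""
    return (B - C) * u**ell / ell - B * u ** (ell + 1) / (ell + 1)


def invasion_integral(ell, B, C, T, x0):
    """Left-hand side of (26) for the assumed piecewise-linear IPG profile."""
    single = _G(1.0, ell, B, C) - _G(1.0 - x0, ell, B, C)  # x in [0, x0]: one round
    repeated = _G(1.0 - x0, ell, B, C)                     # x in (x0, 1]: T rounds
    return single + T * repeated


def critical_relatedness(B, C, T, x0, ell_min=1e-6, ell_max=1e4):
    """Bisection for a root of the invasion integral in ell; returns R = 1 / (1 + ell)."""
    f_lo = invasion_integral(ell_min, B, C, T, x0)
    f_hi = invasion_integral(ell_max, B, C, T, x0)
    if not (f_lo > 0 > f_hi):
        raise ValueError("no sign change of the invasion integral on the search interval")
    lo, hi = ell_min, ell_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if invasion_integral(mid, B, C, T, x0) > 0:
            lo = mid   # invasion still possible: the critical ell is larger
        else:
            hi = mid
    return 1.0 / (1.0 + 0.5 * (lo + hi))


def large_T_expression(B, C, T):
    """The expression -ln(1 - C/B) / ln(T) quoted in the caption of Fig. 5."""
    return -log(1.0 - C / B) / log(T)


if __name__ == "__main__":
    B, C = 3.0, 1.0
    for T in (10, 100, 1000):
        print(T, critical_relatedness(B, C, T, x0=C / B), large_T_expression(B, C, T))
```

For $T = 1$ the routine recovers Hamilton's threshold $C/B$, and the critical relatedness
decreases as $T$ grows. The numbers refer to the large-$n$ limit only and need not coincide
with the finite-$n$ values reported in the figure.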

Figure 6: THR: $n = 10$, $\theta = 4$, $C = 1$, $A = T + 1$, $A' = 2T$. (Top Left) $T = 1$
($A = 2$, $A' = 2$): $R_s = 32.6\%$ ($m_s = 9\%$) and $R_f = 57.2\%$ ($m_f = 3.5\%$). (Top
Right) $T = 10$ ($A = 11$, $A' = 20$): $R_s = 8.95\%$ ($m_s = 29.6\%$) and $R_f = 57.2\%$
($m_f = 3.5\%$). (Bottom Left) $T = 100$ ($A = 101$, $A' = 200$): $R_s = 2.9\%$ ($m_s = 52.1\%$)
and $R_f = 57.2\%$ ($m_f = 3.5\%$). (Bottom Right) $T = 300$ ($A = 301$, $A' = 600$):
$R_s = 1.73\%$ ($m_s = 61.2\%$) and $R_f = 57.2\%$ ($m_f = 3.5\%$). Open cyan circles represent
unstable fixed points, black stars represent stable fixed points. The critical relatedness for
invasion by types A, $R_s$, is indicated by a red dashed line, while the critical relatedness for
invasion by types N, $R_s^N$, is depicted as a magenta dashed line. The pictures show that
here $R_f = R_s^N$.

Figure 7: THR: $n = 30$, $\theta = 10$, $C = 1$, $A = T + 1$, $A' = 2T$. (Top Left) $T = 1$
($A = 2$, $A' = 2$): $R_s = 33.5\%$ ($m_s = 3.2\%$) and $R_f = 61.4\%$ ($m_f = 1\%$). (Top
Right) $T = 10$ ($A = 11$, $A' = 20$): $R_s = 11.5\%$ ($m_s = 10.7\%$) and $R_f = 61.4\%$
($m_f = 1\%$). (Bottom Left) $T = 100$ ($A = 101$, $A' = 200$): $R_s = 5.5\%$ ($m_s = 20.2\%$)
and $R_f = 61.4\%$ ($m_f = 1\%$). (Bottom Right) $T = 300$ ($A = 301$, $A' = 600$):
$R_s = 4.2\%$ ($m_s = 24.7\%$) and $R_f = 61.4\%$ ($m_f = 1\%$). Open cyan circles represent
unstable fixed points, black stars represent stable fixed points. The critical relatedness for
invasion by types A, $R_s$, is indicated by a red dashed line, while the critical relatedness for
invasion by types N, $R_s^N$, is depicted as a magenta dashed line. The pictures show that
here $R_f = R_s^N$.

Figure 8: THR: $n = 10$, $\theta = 8$, $C = 1$, $A = T + 1$, $A' = 2T$. (Top Left) $T = 1$
($A = 2$, $A' = 2$): $R_s = R_f = 64.2\%$ ($m_s = m_f = 2.7\%$) and $R_s^N = 22.1\%$ ($m_s^N = 14\%$).
(Top Right) $T = 10$ ($A = 11$, $A' = 20$): $R_s = R_f = 31\%$ ($m_s = m_f = 9.5\%$)
and $R_s^N = 22.1\%$ ($m_s^N = 14\%$). (Bottom Left) $T = 100$ ($A = 101$, $A' = 200$):
$R_s = 16.1\%$ ($m_s = 18.9\%$) and $R_f = R_s^N = 22.1\%$ ($m_f = m_s^N = 14\%$). (Bottom Right)
$T = 300$ ($A = 301$, $A' = 600$): $R_s = 12.3\%$ ($m_s = 23.6\%$) and $R_f = R_s^N = 22.1\%$
($m_f = m_s^N = 14\%$). Open cyan circles represent unstable fixed points, black stars represent
stable fixed points. The critical relatedness for invasion by types A, $R_s$, is indicated by a
red dashed line, while the critical relatedness for invasion by types N, $R_s^N$, is depicted as
a magenta dashed line. The top pictures show $R_f = R_s$, while the bottom pictures show
$R_f = R_s^N$.

7 Conclusions



For a large class of group structured populations, involving competition within groups
and possibly also among groups, and/or allowing for elasticity in group size with local
regulation, the equilibria under weak selection are obtained by setting $\Delta p = 0$ in (14),
or equivalently, $D(p) = 0$ in (15) (with inputs from (12), (13) and (3)). The stability of each
one of these equilibria is obtained from the sign of $\Delta p$, or equivalently, the sign of $D(p)$, close
to it. In particular rare mutant alleles A will invade when (16) holds. If groups are large
and the migration rate low, these conditions take simpler forms, provided by (23), (24) and
(25), or (26), in which only the scaled gene flow parameter $m n_e^g$ appears, or, equivalently,
only the relatedness parameter $R = 1/(1 + 2 m n_e^g)$ appears. All these conditions are
easy to apply, providing tools that can address any sort of intra-group interaction and the
complexities that result from multi-individual, possibly iterated, games across a life-cycle,
including contingencies of behavior [20, 7, 25, 3, 18, 12, 49, 2, 13, 8, 45, 36, 19, 4, 5, 6]. In this
way one can study biological models of social evolution in group-structured populations, in
which gene action is non-additive, producing marginal fitness functions under weak selection
that are non-linear functions of group composition. Our methods can therefore be used
when approaches based on differentiability of fitness functions [48, 11, 51, 12, 40, 27, 28]
are not applicable, as explained in [43].

Under the two conditions: (C1) isolated mutants have lower fitness than the wild type;
(C2) mutants that are in groups with no wild types have fitness that is larger than the
wild types, the following regimes occur, when we start from a small fraction of mutants.
(1) No invasion possible for high gene flow; (2) invasion leading to fixation for low gene
flow; and possibly also: (3) invasion leading to a polymorphic equilibrium for intermediate
levels of gene flow. The three regimes are often present for iterated public goods games
with contingent cooperation, and for threshold models, as we observed in Section . This
contrasts with what happens with linear public goods games, for which there is no possibility
of polymorphic equilibria under weak selection, suggesting that non-linearities in fitness
functions may be present when such equilibria are observed. (Compare with [39] where
strong selection was proposed as an explanation.) Invasion of the cooperative types A can
occur in these models under modest levels of group relatedness, compatible with values
observed in several species [ ] (Tables 6.4 and 6.5), [ ] (Table 4.9) and [ ]. This result
extends to a broad class of population structures the main conclusion in [44], showing
that population viscosity, without kin recognition, can produce levels of genetic assortment
that are sufficient for the spread of cooperative/altruistic intra-group behavior. In other
words, contrary to a widespread claim [33, 53, 16, 34, 1, 9, 24, 29, 35, 30, 52], the biological
conditions for "the good of the group to override the interest of the individual" are not
stringent. The opposite conclusion had been obtained and reinforced over the years based
on the analysis of special models, mostly variants of a linear public goods game. The
possibility of correcting that misunderstanding using the techniques from [44] and the
current paper highlights the importance of having available good methods for analyzing
non-linear marginal fitness functions.

8 Appendix A. Variable group size

The kind of population structure that is being considered in a simplified fashion in Gen. 2 can
be described as follows. There is a typical group size $n_0$; groups with fewer members tend
to generate larger groups in the next generation (possibly because resources are abundant
for the small group), while groups with more members tend to generate smaller groups in
the next generation (possibly because resources are scarce for the large group). This can
be modeled by fitness functions $w^A_{n,k} = (1 + \delta v^A_{n,k})\,h(n/n_0)$ and $w^N_{n,k} = (1 + \delta v^N_{n,k})\,h(n/n_0)$,
with a function $h(s)$ that is decreasing and takes the value 1 at $s = 1$. If the typical group
size, $n_0$, is large, the law of large numbers implies that each group that is created has size
close to its expected value. In this case we should expect group sizes that are generally
not far from $n_0$, in relative terms. A full analysis of this framework for moderate sizes of
$n_0$ would nevertheless involve the evolution of group sizes. For this reason, we provided a
more idealized, but also more tractable approach, in our generalization Gen. 2 of 21FW.

We present now in detail an instance of the mathematical assumptions introduced in
Gen. 2, so that we can be sure that this can be done in a mathematically sound way. Our
goal is primarily to be sure of the mathematical consistency of the assumptions made in
Gen. 2, not aiming now for biological realism. The reader should also keep in mind that in
the analysis in the paper, details such as those discussed next are not relevant. This robustness
of the methods presented in the paper is one of their strengths. Our example will be
mathematically as simple as possible. We suppose that each group creates on average 1
group in the next generation and that groups always have one of the three sizes $n_0$, $n_0 - 1$,
or $n_0 + 1$. (The notation $n_0$ replaces the $n$ from the paper in this appendix, so that here
we can use $n$ as a free variable to designate an arbitrary group size.) We will suppose that
for the possible values of $n$ and $k$, the fitness functions are as presented in the previous
paragraph. Therefore $\bar{w}_{n,k} = (1 + \delta \bar{v}_{n,k})\,h(n/n_0)$, where $\bar{v}_{n,k} = (k v^A_{n,k} + (n-k) v^N_{n,k})/n$. The
assumptions on the function $h$ imply that $\bar{w}_{n_0,k} = 1 + \delta \bar{v}_{n_0,k}$, and for small $\delta > 0$ we have
$\bar{w}_{n_0-1,k} > 1$ and $\bar{w}_{n_0+1,k} < 1$, for all $k$.

We want to check if the transition probabilities $\Pr(n'|n, k)$ that give the distribution of
the size $n'$ of an offspring group, conditioned on the size $n$ of its parental group and the
number $k$ of members of that group that are types A, can satisfy the conditions proposed
in Gen. 2. For this we first write $\Pr(n_0|n_0, k) = 1 - \alpha_k - \beta_k$,
$\Pr(n_0 - 1|n_0, k) = \alpha_k$ and $\Pr(n_0 + 1|n_0, k) = \beta_k$. Then, for given fitness
functions as above we must find probabilities $\alpha_k$ and $\beta_k$, both of order $\delta$,
so that we obtain the proper average expected number of
offspring of a group of size $n_0$ that includes $k$ types A. This means that we must solve
$n_0(1 - \alpha_k - \beta_k) + (n_0 - 1)\alpha_k + (n_0 + 1)\beta_k = n_0(1 + \delta \bar v_{n_0,k})$.
This has many solutions when $\delta > 0$ is sufficiently small, the simplest one being
$\alpha_k = -\delta n_0 \bar v_{n_0,k}$, $\beta_k = 0$ if $\bar v_{n_0,k} < 0$,
and $\beta_k = \delta n_0 \bar v_{n_0,k}$, $\alpha_k = 0$ if $\bar v_{n_0,k} > 0$. Biologically
this means that a group of size
$n_0$ typically creates only groups of size $n_0$, but when the average fitness in the group is
slightly larger (smaller) than 1, then the group creates with some small positive probability
a group with one more (one less) member. Next we have to specify what $\Pr(n'|n_0 - 1, k)$
and $\Pr(n'|n_0 + 1, k)$ are. We only do it for the former, since the latter is analogous. We
write $\Pr(n_0 - 1|n_0 - 1, k) = 1 - \gamma_k$, $\Pr(n_0|n_0 - 1, k) = \gamma_k$. We have to be
sure that, when $\delta$ is small, for each value of $k$, we can find probabilities
$\gamma_k > 0$ so that $(n_0 - 1)\bar w_{n_0-1,k} = (1 - \gamma_k)(n_0 - 1) + \gamma_k n_0$.
This is equivalent to $\gamma_k = (n_0 - 1)(\bar w_{n_0-1,k} - 1)$. In case (and only in
case) $\bar w_{n_0-1,k}$ is slightly larger than 1 for each value of $k$, the corresponding
$\gamma_k$ are between 0 and 1, as we needed. This condition on $\bar w_{n_0-1,k}$ means that
$1 < h((n_0 - 1)/n_0) < n_0/(n_0 - 1)$,
and means that groups of size $n_0 - 1$ should be more productive than groups of size $n_0$,
but imposes an upper bound on by how much. An analogous condition holds on the other
side: $n_0/(n_0 + 1) < h((n_0 + 1)/n_0) < 1$. (If these conditions on $h$ fail, one needs to
allow more values of $n$ to be reached in the model.)
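
The construction above can be checked numerically. The following is a minimal sketch, not part
of the original analysis: the function h, the values of n0, delta and the payoffs in vbar are
hypothetical choices, and gamma_plus is our reading of the "analogous" case for parental groups
of size n0 + 1. The sketch builds the three rows of Pr(n'|n, k) for one fixed k and compares
the mean offspring group size with n times the average fitness.

```python
def h(s):
    # Hypothetical choice: any decreasing h with h(1) = 1 that satisfies
    # the bounds discussed in the text would do.
    return 1.0 + 0.5 * (1.0 - s)


def wbar(n, n0, delta, vbar_n):
    """Average fitness in a group of size n: (1 + delta * vbar) * h(n / n0)."""
    return (1.0 + delta * vbar_n) * h(n / n0)


def size_transitions(n0, delta, vbar):
    """Build Pr(n' | n, k) for n in {n0 - 1, n0, n0 + 1} and one fixed k.

    vbar maps each group size to the average payoff in a group of that size.
    """
    x = delta * n0 * vbar[n0]
    alpha, beta = (-x, 0.0) if x < 0 else (0.0, x)
    gamma_minus = (n0 - 1) * (wbar(n0 - 1, n0, delta, vbar[n0 - 1]) - 1.0)
    # Assumed analog for size n0 + 1: shrink to n0 with probability gamma_plus.
    gamma_plus = (n0 + 1) * (1.0 - wbar(n0 + 1, n0, delta, vbar[n0 + 1]))
    rows = {
        n0 - 1: {n0 - 1: 1.0 - gamma_minus, n0: gamma_minus},
        n0: {n0 - 1: alpha, n0: 1.0 - alpha - beta, n0 + 1: beta},
        n0 + 1: {n0: gamma_plus, n0 + 1: 1.0 - gamma_plus},
    }
    for row in rows.values():
        if not all(0.0 <= q <= 1.0 for q in row.values()):
            raise ValueError("delta or h violates the bounds; not a probability")
    return rows


n0, delta = 10, 0.01
vbar = {9: 0.3, 10: 0.3, 11: 0.3}  # hypothetical average payoffs
rows = size_transitions(n0, delta, vbar)
for n, row in rows.items():
    mean_offspring = sum(size * q for size, q in row.items())
    # The two printed numbers should agree for every parental size n.
    print(n, round(mean_offspring, 6), round(n * wbar(n, n0, delta, vbar[n]), 6))
```

If h or delta is chosen so that one of the bounds on h fails, size_transitions raises an error,
which mirrors the remark that more group sizes would then be needed.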

Having specified $\Pr(n'|n, k)$ in the previous paragraph, and assuming that intragroup
competition is modeled by Fisher-Wright sampling (i.e., we are not in the more general
setting of Gen. 4), we have the complete transition matrix:

$$\Pr(n', k'|n, k) = \Pr(n'|n, k) \, {\rm bin}(k'|n', k w^A_{n,k}/(n \bar w_{n,k})).$$

This provides us with a well defined
stochastic process in which each individual in each generation has an expected number of
offspring given by the proposed fitness functions $w^A_{n,k}$ and $w^N_{n,k}$. (Keep in mind
that migration among groups, after they are created, connects the $g$ groups, so that the
stochastic process is a Markov chain, but only in a very large state space involving the whole
population. Under weak selection, though, we will be able to consider the migrants to a focal
group as coming from a metapopulation with fixed frequency $p$ of types A, and in this way
effectively reduce the analysis to that of a Markov chain acting on a single focal group.)
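
As a sanity check of the claim about expected offspring numbers, the following minimal sketch
builds the joint transition probabilities for one parental group and computes the per-capita
expected number of offspring of each type. All numerical values (group size, payoffs, delta) are
hypothetical, and the parental group is taken to have the typical size, so that the factor h
equals 1.

```python
from math import comb


def binom_pmf(k, n, q):
    """bin(k | n, q), with the convention that it vanishes outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * q**k * (1.0 - q) ** (n - k)


def joint_transition(size_row, n, k, w_a, w_n):
    """Pr(n', k' | n, k) = Pr(n' | n, k) * bin(k' | n', k w_a / (n wbar))."""
    w_bar = (k * w_a + (n - k) * w_n) / n
    q = k * w_a / (n * w_bar)
    return {
        (n2, k2): pr * binom_pmf(k2, n2, q)
        for n2, pr in size_row.items()
        for k2 in range(n2 + 1)
    }


# Hypothetical numbers: parental group of typical size n = 10 with k = 4 types A.
n, k, delta = 10, 4, 0.01
v_a, v_n = -1.0, 2.0
w_a, w_n = 1.0 + delta * v_a, 1.0 + delta * v_n  # h(1) = 1 at the typical size
v_bar = (k * v_a + (n - k) * v_n) / n            # positive here, so alpha = 0
beta = delta * n * v_bar                         # Pr(n + 1 | n, k)
size_row = {n: 1.0 - beta, n + 1: beta}

joint = joint_transition(size_row, n, k, w_a, w_n)
offspring_a = sum(pr * k2 for (n2, k2), pr in joint.items())
offspring_n = sum(pr * (n2 - k2) for (n2, k2), pr in joint.items())
print(offspring_a / k, w_a)        # per-capita offspring of a type A vs its fitness
print(offspring_n / (n - k), w_n)  # per-capita offspring of a type N vs its fitness
```

The agreement holds because the mean offspring group size is n times the average fitness, and
the binomial parameter is the fitness-weighted share of types A in the parental group.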

There is a particularly nice twist to the framework under Gen. 2. The fitness functions
$w^A_{n,k}$ and $w^N_{n,k}$ in this setting can be absolute fitnesses (expected number of
offspring of an individual), not just relative fitnesses. In 2lFW, with fixed total population
of size $ng$, the average absolute fitness must always be 1, but this is not the case under Gen. 2.
And in the example in the previous two paragraphs, $w^A_{n,k}$ and $w^N_{n,k}$ are indeed absolute
fitnesses. (This is what we assumed, when we wrote equations that had to be satisfied by
the transition probabilities.) This fact can be puzzling at first sight. If $w^A_{n,k}$ and
$w^N_{n,k}$ are absolute fitnesses and we have, say, $v^A_{n_0,n_0} > 0$ and types A fixated
(every individual is type
A), then it would seem that everyone has an average fitness larger than 1 in a population
in equilibrium, something that cannot happen. The (incorrect) reasoning behind this idea
is that in equilibrium, with small $\delta > 0$, almost all groups are of size $n_0$ (correct)
and that for this size $w^A_{n_0,n_0} = 1 + \delta v^A_{n_0,n_0} > 1$ (correct). So we seem to
conclude that the average
fitness in equilibrium must be larger than 1 by an amount of order $\delta$. But in this reasoning
we are forgetting that even if only a fraction of order $\delta$ of groups in equilibrium have size
$n_0 + 1$, individuals in these groups have fitness
$w^A_{n_0+1,n_0+1} = (1 + \delta v^A_{n_0+1,n_0+1}) h((n_0 + 1)/n_0) < 1$.
This provides a term to be added to the average fitness that is below 1 by an amount of
order $\delta$, sufficient to explain how such an equilibrium is possible and has the necessary
average absolute fitness 1. Most groups will have size $n_0$, and their members produce slightly
more than one offspring. But the few groups that have size $n_0 + 1$ compensate this push
upwards, since their members produce on average a number of offspring below 1 by a
fixed amount.
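
The compensation argument can be illustrated with a small computation. The sketch below is
only an illustration under stated assumptions: h, n0, delta and the payoff v are hypothetical,
the same payoff v is used for all-A groups of both sizes, and the fraction of groups of each
size is taken to be the stationary distribution of the two-state size chain (size n0 grows with
probability beta, size n0 + 1 shrinks with probability gamma).

```python
def h(s):
    # Hypothetical decreasing function with h(1) = 1.
    return 1.0 + 0.5 * (1.0 - s)


def fixed_a_equilibrium(n0, delta, v):
    """Group-size mix and average absolute fitness when every individual is type A."""
    w_typ = 1.0 + delta * v                        # fitness in groups of size n0
    w_big = (1.0 + delta * v) * h((n0 + 1) / n0)   # fitness in groups of size n0 + 1
    beta = delta * n0 * v                          # Pr(n0 + 1 | n0)
    gamma = (n0 + 1) * (1.0 - w_big)               # Pr(n0 | n0 + 1)
    frac_big = beta / (beta + gamma)               # stationary fraction of large groups
    frac_typ = 1.0 - frac_big
    mean_size = frac_typ * n0 + frac_big * (n0 + 1)
    mean_offspring = frac_typ * n0 * w_typ + frac_big * (n0 + 1) * w_big
    return frac_big, mean_size, mean_offspring / mean_size


frac_big, mean_size, avg_fitness = fixed_a_equilibrium(n0=10, delta=0.01, v=0.3)
print(frac_big)     # small fraction of groups with n0 + 1 members
print(mean_size)    # slightly above n0
print(avg_fitness)  # equal to 1 up to rounding error
```

Members of typical groups have fitness above 1, members of the rare larger groups have fitness
below 1, and the individual-weighted average comes out as 1.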

It is also very instructive to compare the situation in which types N are fixated with that
in which types A are fixated, under the assumption that $v^N_{n_0,0} = 0$ and, as above,
$v^A_{n_0,n_0} > 0$.
In each one of these fixated equilibria, the average absolute fitness is by necessity 1. But
the average size of the population is larger in the case A is fixated. When N is fixated
groups only have size $n_0$ in equilibrium, while when A is fixated groups have size $n_0$ or
$n_0 + 1$ in equilibrium. In this case only a fraction of order $\delta$ of groups have size
$n_0 + 1$, so that the average group size exceeds $n_0$ only by an amount of that order, but
it does exceed it.

9 Appendix B. General intragroup transition matrix



Here we provide some details on Gen. 4. First note that if one is in the setting of Gen. 2,
or Gen. 3, then one would in principle have to define transition probabilities from $(n, k)$ to
$(n', i)$, and that the distribution of $i$ may depend on $n$ in addition to $k$. But because we
are only concerned with weak selection in this paper, we will only need to consider the
transition matrix in case $\delta = 0$ (small $\delta$ is treated as a perturbation), and in this
case, even under Gen. 2 and Gen. 3, we are restricted to groups of a fixed size $n$. For this
reason, we use the notation $P_{k,i}$ for the $\delta = 0$ case. Note that $P_{k,i}$ is a Markov
transition matrix, and recall that we are assuming $\sum_i i P_{k,i} = k$ (when $\delta = 0$, A
and N are neutral markers) and $P_{0,0} = P_{n,n} = 1$ (the states 0 and $n$ are traps for the
Markov chain generated by $P_{k,i}$).

Under the generalization Gen. 4 (added possibly to Gen. 1, Gen. 2 and Gen. 3), the Markov
transition matrix $T_{j,k}$ that appears in (1) and (3) takes the form

$$T_{j,k} = \sum_{i, i', i''} P_{j,i} \, {\rm bin}(i'|i, m) \, {\rm bin}(i''|n - i, m) \,
{\rm bin}(k - i + i'|i' + i'', p). \qquad (27)$$

Explanation: with probability $P_{j,i}$, the new group is created with $i$ types A and $n - i$
types N, of which, respectively, $i'$ and $i''$ migrate and are replaced with migrants from
the metapopulation. The number $k$ of types A in the group after migration is then the
sum of $i - i'$ (non-migrants) and a binomial random variable corresponding to $i' + i''$
attempts, each with probability $p$ of success (migrants). (We use the standard convention
that ${\rm bin}(i|n, q) = 0$ when $i < 0$ or $i > n$.)
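
The triple sum in (27) is straightforward to evaluate for small groups. The following minimal
sketch does so for an arbitrary neutral matrix P and, as a check, uses the Fisher-Wright choice
P[j][i] = bin(i|n, j/n). In that special case each member of the new group is independently of
type A with probability (1 - m) j/n + m p, so (27) should collapse to a single binomial; this
closed form is our own consistency check, and the values of n, m and p are hypothetical.

```python
from math import comb


def binom_pmf(i, n, q):
    """bin(i | n, q), equal to 0 when i < 0 or i > n."""
    if i < 0 or i > n:
        return 0.0
    return comb(n, i) * q**i * (1.0 - q) ** (n - i)


def transition_matrix(P, n, m, p):
    """Evaluate (27): sum over i (types A created), i1 and i2 (migrating A's and N's)."""
    T = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):
        for i in range(n + 1):
            if P[j][i] == 0.0:
                continue
            for i1 in range(i + 1):
                a = P[j][i] * binom_pmf(i1, i, m)
                for i2 in range(n - i + 1):
                    b = a * binom_pmf(i2, n - i, m)
                    for k in range(n + 1):
                        # k - i + i1 of the i1 + i2 incoming migrants must be type A.
                        T[j][k] += b * binom_pmf(k - i + i1, i1 + i2, p)
    return T


n, m, p = 4, 0.3, 0.4  # hypothetical values
P_fw = [[binom_pmf(i, n, j / n) for i in range(n + 1)] for j in range(n + 1)]
T = transition_matrix(P_fw, n, m, p)
T_direct = [
    [binom_pmf(k, n, (1.0 - m) * j / n + m * p) for k in range(n + 1)]
    for j in range(n + 1)
]
max_gap = max(abs(T[j][k] - T_direct[j][k]) for j in range(n + 1) for k in range(n + 1))
print(max_gap)  # close to 0 if the two constructions agree
```

Any other neutral P (for instance a high-skew scheme) can be passed to transition_matrix in the
same way; only the Fisher-Wright case has the single-binomial shortcut used in the check.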

All the claims in Sections 3 and 4 hold, with the same arguments given there and with
$\varphi(p)$ being the (unique once $m > 0$) stationary distribution of the Markov chain with
transition matrix $T_{j,k}$ given in (27). Also the statements in Appendices C, D and E hold
without changes in the argumentation.

We now present two types of examples of intragroup transition with substantially different
biological meanings. In one important class of examples, each individual in the
parental group can be thought of as creating an independent random number of offspring
with a certain distribution (independent of $n$) with mean proportional to individual fitness,
conditioned on the total number of offspring created being $n$ (or whatever the group size
is in case of Gen. 2). When the individual offspring distribution is Poisson we have the
Fisher-Wright case. We will refer to these examples as the case of the "distributed" production
(or transition) scheme.

In contrast, one member of the parental group can be chosen at random, with probability 
proportional to individual fitness, and mother the whole new group. This example, with 
high reproductive skew, will be referred to as "concentrated" production (or transition) 
scheme. 

The distributed and concentrated intragroup transition schemes illustrate different
aspects of the generalized 2lFW framework. In the concentrated case, we can easily express
$P_{k,i}$ and use it to compute the corresponding $T_{j,k}$ and $\varphi_k(p)$. This is not so in
the distributed case, for which $P_{k,i}$ is usually difficult to compute. But in contrast to the
concentrated case, the distributed case (supposing that the offspring distribution for the
individuals has a finite second moment) satisfies the conditions for Wright's beta
approximation (22) to $\varphi_k(p)$ to apply. This is so because in this case, when $n$ is large,
there is sufficient independence among the types of the members of the offspring group
(conditioned on the fraction of types A in the parental group), for the diffusion approximation
used to derive that beta distribution to be applicable. This adds significantly to the relevance of
the Fisher-Wright case and the beta approximation results in Section 5. They represent, to a
good approximation, a broad class of intragroup transition schemes when $n$ is large.

10 Appendix C. Graphical construction of the equilibria

The equilibria $\varphi(p) = (\varphi_0(p), \ldots, \varphi_n(p))$ can be easily computed using ( ).
(This is not so easy in case of Gen. 4, but can still be done using (27).) This equation can also
be used to derive the properties of $\varphi(p)$. Here we present an alternative way of describing
these equilibria, that will appeal to readers who are fond of graphical devices, and that provides
additional intuition.

The idea is to partition the members of a group into classes of equivalence defined by 
the property of being IBD. In each one of these (IBD)-classes, all the individuals will be of 
type A, or all will be of type N, with respective probabilities p and 1 — p, independently 
from class to class. 

To obtain the (IBD)-classes, we should follow the lineages of the members back in time
and record coalescence events and migration events. Once there is a migration event in
a lineage, we can stop following this lineage. The migration event means that this lineage
reached the group in that generation. All members of the (IBD)-class of descendants from
this migrant will therefore be of the same type as this migrant. Because we are considering
equilibrium, with frequency $p$ of types A, this type will be A or N with probability $p$ or
$1 - p$. And because the population is very large and migration is random, the different migrants
to a group are type A or N independently of each other, implying that the (IBD)-classes are
type A or N independently of each other.

The benefit of looking at the equilibria $\varphi(p)$ in the fashion above is that it allows one
to see easily what happens in special cases, especially extreme cases. When $p$ is close to
0 (or 1), all the (IBD)-classes will be likely to have individuals who are type N (or type
A). This explains (5) and (6). When $m = 1$ all individuals are migrants in generation
$t$. Therefore each individual is in a different (IBD)-class, and clearly (7) holds. In the
opposite extreme, when $m \to 0$, all the lineages of the members of the group are likely to
coalesce before migration, and we then obtain, with overwhelming probability, a single
(IBD)-class. With probability $p$ this class will contain only types A, while with probability
$1 - p$ it will contain only types N. This explains (8).
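
The two extreme cases can be observed numerically. The sketch below is a minimal illustration
that assumes neutral Fisher-Wright sampling followed by migration, so that a group with j types
A produces k types A with probability bin(k|n, (1 - m) j/n + m p); the stationary distribution
is approximated by repeated multiplication, and the values of n, m and p are hypothetical.

```python
from math import comb


def binom_pmf(k, n, q):
    """bin(k | n, q), equal to 0 when k < 0 or k > n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * q**k * (1.0 - q) ** (n - k)


def stationary(n, m, p):
    """Approximate the equilibrium distribution of the number of types A in a group.

    Assumes Fisher-Wright sampling with migration rate m > 0 and metapopulation
    frequency p; the number of iterations is scaled with 1/m so that the
    iteration has converged to rounding accuracy.
    """
    T = [
        [binom_pmf(k, n, (1.0 - m) * j / n + m * p) for k in range(n + 1)]
        for j in range(n + 1)
    ]
    phi = [1.0 / (n + 1)] * (n + 1)
    for _ in range(int(40 / m) + 1):
        phi = [sum(phi[j] * T[j][k] for j in range(n + 1)) for k in range(n + 1)]
    return phi


n, p = 5, 0.3
phi_all_migrants = stationary(n, 1.0, p)   # every individual is a migrant
phi_rare_migrants = stationary(n, 0.001, p)  # lineages coalesce before migrating
print([round(x, 4) for x in phi_all_migrants])   # binomial with parameters n and p
print([round(x, 4) for x in phi_rare_migrants])  # mass concentrated at 0 and n
```

With m = 1 each individual forms its own class and the group composition is binomial; with
very small m nearly all the mass sits on the two monomorphic states, with weight close to p on
the all-A state.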

11 Appendix D. Relatedness



Select a group at random in generation $t$, and from this group select in order two distinct
individuals (the focal and the co-focal). Define $\hat p$ as the fraction of types A in that group,
$\kappa$ as the fraction of types A among the $n - 1$ individuals in this group that are distinct
from the focal individual (this is the focal's social environment), $A_1$ as the event that the
focal individual is type A, and $A_2$ as the event that the co-focal individual is type A. Define
$I_i$, $i = 1, 2$, as the random variable that takes value 1 when $A_i$ happens and value 0
otherwise. Say also that the focal and the co-focal individuals are identical by descent (IBD) if,
following their lineages back in time, they coalesce before either one experiences a migration
event.

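Relatedness can be defined as certain regression coefficients, namely the regression of
$I_2$ on $I_1$ or, equivalently, that of $\kappa$ on $I_1$:

$$R^{\rm reg} = \rho_{I_2, I_1} = \frac{{\rm Cov}(I_2, I_1)}{{\rm Var}(I_1)}
= \mathbb{E}(I_2|A_1) - \mathbb{E}(I_2|A_1^c)$$

$$= \mathbb{E}(\kappa|A_1) - \mathbb{E}(\kappa|A_1^c)
= \frac{{\rm Cov}(\kappa, I_1)}{{\rm Var}(I_1)} = \rho_{\kappa, I_1}. \qquad (28)$$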

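Here the first, the second and the last equalities are definitions, and the others are
elementary probability identities. Alternatively, relatedness can be defined by

$$R^{\rm IBD} = \Pr(\mbox{focal and co-focal are IBD}). \qquad (29)$$

Wright's $F_{ST}$ statistic can be defined by

$$F_{ST} = \frac{{\rm Var}(\hat p)}{{\rm Var}(I_1)}, \qquad (30)$$

where $\hat p$ is the fraction of types A in the sampled group.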

When the sampling of the focal and co-focal is done in one of the equilibria $\varphi(p)$, the
following well known relationships hold among the three definitions above and (9):

$$R = R^{\rm IBD} = R^{\rm reg} = \frac{n F_{ST} - 1}{n - 1}. \qquad (31)$$

For the reader's benefit, we present short derivations next. 

The first equality in (31) is a consequence of the following equilibrium recursion. The
focal and co-focal will be IBD if and only if neither one is a migrant (probability $(1 - m)^2$)
and they either have the same mother (probability $1/n_{\rm eff}$), or have distinct mothers
that are IBD (probability $(1 - 1/n_{\rm eff}) R^{\rm IBD}$). Hence
$R^{\rm IBD} = (1 - m)^2 \left( 1/n_{\rm eff} + (1 - 1/n_{\rm eff}) R^{\rm IBD} \right)$,
from which we get $R^{\rm IBD} = R$, given by (9).
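
The recursion is linear in the relatedness, so it can be solved in closed form and also by
simple iteration. The following minimal sketch compares the two; the closed form below is
obtained by solving the recursion as stated here (display (9) is not reproduced in this
appendix, so we do not quote it), and the parameter values are hypothetical.

```python
def r_ibd_closed_form(m, n_eff):
    """Solve R = (1 - m)^2 * (1/n_eff + (1 - 1/n_eff) * R) for R."""
    a = (1.0 - m) ** 2
    return a / (n_eff - (n_eff - 1.0) * a)


def r_ibd_by_iteration(m, n_eff, steps=10000):
    """Iterate the equilibrium recursion starting from R = 0."""
    r = 0.0
    for _ in range(steps):
        r = (1.0 - m) ** 2 * (1.0 / n_eff + (1.0 - 1.0 / n_eff) * r)
    return r


m, n_eff = 0.1, 10  # hypothetical migration rate and effective group size
print(r_ibd_closed_form(m, n_eff))
print(r_ibd_by_iteration(m, n_eff))
```

The closed form gives relatedness 1 when there is no migration and 0 when every individual is
a migrant, as expected from the recursion.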

To derive the second equality in (31), let $D$ be the event that focal and co-focal are IBD.
Because we are sampling from the equilibrium $\varphi(p)$, we know that the population has been
in this equilibrium for several ($\gg 1/m$) generations prior to the sampling. Therefore,
$\Pr(A_1 A_2|D) = p$ and $\Pr(A_1 A_2|D^c) = p^2$. (We use the standard convention of writing
$A \cap B$ simply as $AB$. One way to think about these conditional probabilities is in terms
of the (IBD)-classes defined in Appendix C. The event $D$ means that the focal and co-focal
individuals are in the same (IBD)-class. In this case they are of the same type, while in
the opposite case, they are of independent types.) Hence
${\rm Cov}(I_2, I_1) = \Pr(A_1 A_2) - p^2 = p R^{\rm IBD} + p^2 (1 - R^{\rm IBD}) - p^2
= p(1 - p) R^{\rm IBD} = {\rm Var}(I_1) R^{\rm IBD}$, which yields the second
equality in (31).

The third equality in (31) does not depend on having sampled in equilibrium, and
results from the following elementary probability identity:

$${\rm Var}(\hat p) = \frac{n {\rm Var}(I_1) + n(n - 1) {\rm Cov}(I_1, I_2)}{n^2}
= \frac{{\rm Var}(I_1) \left( 1 + (n - 1) R^{\rm reg} \right)}{n},$$

where we used (28) in the second step. Using the definition ( ), this becomes
$F_{ST} = (1 + (n - 1) R^{\rm reg})/n$, which is equivalent to the last equality in (31).
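
The relation between the variance of the group composition and relatedness can be checked on
a concrete equilibrium. The sketch below is a minimal illustration under stated assumptions:
neutral Fisher-Wright sampling with migration (so that the effective number of mothers equals
the group size n), the closed-form solution of the IBD recursion for the relatedness, and
hypothetical values of n, m and p.

```python
from math import comb


def binom_pmf(k, n, q):
    """bin(k | n, q), equal to 0 when k < 0 or k > n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * q**k * (1.0 - q) ** (n - k)


def stationary_fw(n, m, p, iterations=2000):
    """Equilibrium distribution of the number of types A (Fisher-Wright, m > 0)."""
    T = [
        [binom_pmf(k, n, (1.0 - m) * j / n + m * p) for k in range(n + 1)]
        for j in range(n + 1)
    ]
    phi = [1.0 / (n + 1)] * (n + 1)
    for _ in range(iterations):
        phi = [sum(phi[j] * T[j][k] for j in range(n + 1)) for k in range(n + 1)]
    return phi


def relatedness(m, n_eff):
    """Solution of R = (1 - m)^2 * (1/n_eff + (1 - 1/n_eff) * R)."""
    a = (1.0 - m) ** 2
    return a / (n_eff - (n_eff - 1.0) * a)


n, m, p = 8, 0.2, 0.3  # hypothetical values
phi = stationary_fw(n, m, p)
mean_frac = sum(phi[k] * k / n for k in range(n + 1))
var_frac = sum(phi[k] * (k / n - mean_frac) ** 2 for k in range(n + 1))
fst_direct = var_frac / (mean_frac * (1.0 - mean_frac))
fst_formula = (1.0 + (n - 1) * relatedness(m, n)) / n
print(fst_direct, fst_formula)  # the two values should agree
```

The first value is computed directly from the equilibrium distribution, the second from the
relatedness, so their agreement is a numerical instance of the last equality in (31).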

12 Appendix E. Condition for invasion based on IBD distribution

Display (2) in [44] implies that under weak selection, the condition for invasion by allele A
(instability of the equilibrium with $p = 0$) can be written as

$$\sum_{k=1}^{n} \pi_k v^A_k > 0, \qquad (33)$$

where $\pi$ is the stationary distribution of the Markov chain on $\{1, \ldots, n\}$ with
transition matrix

$$Q_{i,j} = m 1_j + (1 - m) \, {\rm bin}(j - 1|n - 1, (1 - m) i/n), \qquad (34)$$

with notation: $1_j = 1$ if $j = 1$ and $1_j = 0$ if $j \neq 1$. Therefore $\pi$ can be computed
from the stationarity condition $\pi = \pi Q$ and the normalization condition
$\sum_{k=1}^{n} \pi_k = 1$.

The distribution $\pi$ has a very simple and biologically natural interpretation, that we
recall next. Two individuals are said to be IBD if following their lineages back in time,
they coalesce before a migration event affects either one. Then, as we explain in [44], $\pi_k$
is the probability that if we choose at random a focal individual, it will have exactly $k$
individuals in its group that are IBD to it (self included). For this reason, we refer to $\pi$ as
the IBD distribution.
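
The IBD distribution is easy to compute for small groups. The following minimal sketch builds
the matrix Q of (34), approximates its stationary distribution by repeated multiplication, and
checks it against the interpretation just given: the expected fraction of the other n - 1 group
members that are IBD to the focal should equal the probability that focal and co-focal are IBD.
The comparison value uses the closed-form solution of the IBD recursion with effective size n
(an assumption appropriate for Fisher-Wright sampling), and the values of n and m are
hypothetical.

```python
from math import comb


def binom_pmf(k, n, q):
    """bin(k | n, q), equal to 0 when k < 0 or k > n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * q**k * (1.0 - q) ** (n - k)


def ibd_matrix(n, m):
    """Q[i-1][j-1] = m * 1{j = 1} + (1 - m) * bin(j - 1 | n - 1, (1 - m) i / n)."""
    return [
        [
            (m if j == 1 else 0.0)
            + (1.0 - m) * binom_pmf(j - 1, n - 1, (1.0 - m) * i / n)
            for j in range(1, n + 1)
        ]
        for i in range(1, n + 1)
    ]


def ibd_distribution(n, m):
    """Stationary distribution of Q for m > 0; entry k - 1 holds pi_k."""
    Q = ibd_matrix(n, m)
    pi = [1.0 / n] * n
    for _ in range(int(40 / m) + 1):
        pi = [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return pi


n, m = 6, 0.1  # hypothetical values
pi = ibd_distribution(n, m)
mean_related = sum(pi[k - 1] * (k - 1) / (n - 1) for k in range(1, n + 1))
a = (1.0 - m) ** 2
r_closed = a / (n - (n - 1) * a)  # solution of the IBD recursion with n_eff = n
print(mean_related, r_closed)     # the two values should agree
```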

Comparing (16) with (33), we see that since $v^A$ is arbitrary, the term inside parentheses
in (16) must equal $\pi_k$. But the computation of $\pi$ as the stationary distribution of $Q$ is
more straightforward. For this reason we used (33) to compute the values of $m_s$ and $R_s$ in
our examples in Section 6. This is done by replacing (33) with the corresponding equality
and finding the value of $m = m_s$ that solves this equation. (Recall that $R_s$ is then defined
by plugging in $m = m_s$ in (9).)
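
One possible way to carry out this computation is bisection on the migration rate. The sketch
below is only an illustration, not the procedure actually used for the examples in Section 6:
it assumes that the left-hand side of (33) changes sign exactly once on the search interval,
and it uses the linear public goods payoff with hypothetical values of n, B and C. For that
payoff the left-hand side reduces to -C plus B times the relatedness, which gives a closed-form
threshold to compare with.

```python
from math import comb, sqrt


def binom_pmf(k, n, q):
    """bin(k | n, q), equal to 0 when k < 0 or k > n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * q**k * (1.0 - q) ** (n - k)


def ibd_distribution(n, m):
    """Stationary distribution of the matrix Q in (34); entry k - 1 holds pi_k."""
    Q = [
        [
            (m if j == 1 else 0.0)
            + (1.0 - m) * binom_pmf(j - 1, n - 1, (1.0 - m) * i / n)
            for j in range(1, n + 1)
        ]
        for i in range(1, n + 1)
    ]
    pi = [1.0 / n] * n
    for _ in range(int(40 / m) + 1):
        pi = [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return pi


def invasion_lhs(n, m, payoff_a):
    """Left-hand side of (33): sum over k of pi_k * v^A_k."""
    pi = ibd_distribution(n, m)
    return sum(pi[k - 1] * payoff_a(k) for k in range(1, n + 1))


def find_threshold(n, payoff_a, lo=0.01, hi=0.99, steps=60):
    """Bisection for the m at which (33) holds with equality (single sign change assumed)."""
    if not (invasion_lhs(n, lo, payoff_a) > 0.0 > invasion_lhs(n, hi, payoff_a)):
        raise ValueError("no sign change on the search interval")
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if invasion_lhs(n, mid, payoff_a) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n, B, C = 5, 5.0, 1.0  # hypothetical group size, benefit and cost
m_s = find_threshold(n, lambda k: -C + B * (k - 1) / (n - 1))
# Closed form for this payoff: relatedness equal to C / B, solved for m with n_eff = n.
r_s = C / B
m_closed = 1.0 - sqrt(r_s * n / (1.0 + r_s * (n - 1)))
print(m_s, m_closed)
```

For non-linear payoffs the same routine applies with a different payoff function, provided the
single-sign-change assumption is checked for the case at hand.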

In the same way that one asks for conditions for the stability of the equilibrium with
$p = 0$, one can ask the analogous question about the equilibrium with $p = 1$. When can
allele N invade a population in which there are only types A? We define a threshold migration
rate (and the corresponding relatedness) as the least amount of migration (largest degree of
relatedness) for which this invasion can happen. Since in the derivations of (16) and (33) there
is nothing that qualitatively distinguishes types A from types N, we can use these invasion
conditions, with the appropriate quantitative modifications, to compute these two thresholds.
The version of the condition for invasion of allele N based on (33) reads

$$\sum_{k=1}^{n} \pi_k v^N_{n-k} > v^A_n. \qquad (35)$$

The only subtlety is the fact that the right-hand side in (33) is $v^N_0 = 0$, which has to be
replaced with $v^A_n$ in (35). Indeed, the meaning of the left-hand side in (33) is the average
marginal fitness of a focal type A (since it is rare, only individuals who are IBD to it in
its group are also types A). And the meaning of the right-hand side in (33) is the average
marginal fitness of a focal type N, in the population dominated by types N and with few
invading types A. That fitness is essentially $v^N_0$, since almost all types N then are in groups
with no types A. This makes (33) intuitive, and an analogous reasoning with the roles of A
and N interchanged makes (35) equally intuitive. The threshold migration rate for invasion by N
is now obtained by solving for equality in (35), and the corresponding relatedness is then
obtained by plugging this value of $m$ into (9).



13 Appendix F. Linear public goods game and relatedness

In the basic example of the PG, $v^A_k = -C + B(k - 1)/(n - 1)$ and $v^N_k = Bk/(n - 1)$, it is
well known [16] that (14) takes the simple form

$$\Delta p = \delta p(1 - p)(-C + BR) + O(\delta^2). \qquad (36)$$

Hamilton [ ] derived (36) using the Price equation and the relationship between $R$ and
Wright's $F_{ST}$ statistics. For completeness, and for the reader's benefit, we present next two
alternative derivations of (36).

A simple way to derive (36) from (14) is to directly compute $V^A(p) - V^N(p)$, using
their definitions, (12) and (13). Using also (4), we obtain, after some standard algebraic
manipulations,

$$V^A(p) - V^N(p) = -C + B \, \frac{\frac{n {\rm Var}(\hat p)}{p(1 - p)} - 1}{n - 1}
= -C + B \, \frac{n F_{ST} - 1}{n - 1} = -C + BR, \qquad (37)$$

where $\hat p$ is the fraction of types A in a group sampled from $\varphi(p)$ and we used (31)
in the last step.

An alternative derivation of (36) from (14) is to observe that the payoff to the focal
individual can be written as $-C I_1 + B \kappa$, and therefore

$$V^A(p) - V^N(p) = \mathbb{E}(-C + B\kappa|A_1) - \mathbb{E}(B\kappa|A_1^c)
= -C + B \left[ \mathbb{E}(\kappa|A_1) - \mathbb{E}(\kappa|A_1^c) \right]
= -C + B R^{\rm reg} = -C + BR, \qquad (38)$$

where we used (28) and (31).